Abstract

Secure Learning and Learning for Security: Research in the Intersection

by

Benjamin Rubinstein

Doctor of Philosophy in Computer Science
and the Designated Emphasis in Communication, Computation, and Statistics

University of California, Berkeley

Professor Peter L. Bartlett, Chair


Statistical Machine Learning is used in many real-world systems, such as web search, network
and power management, online advertising, finance and health services, in which adversaries
are incentivized to attack the learner, motivating the urgent need for a better understanding
of the security vulnerabilities of adaptive systems. Conversely, research in Computer
Security stands to reap great benefits by leveraging learning for building adaptive defenses
and even designing intelligent attacks on existing systems. This dissertation contributes
new results in the intersection of Machine Learning and Security, relating to both of these
complementary research agendas.

The first part of this dissertation considers Machine Learning under the lens of Computer
Security, where the goal is to learn in the presence of an adversary. Two large case-studies
on email spam filtering and network-wide anomaly detection explore adversaries that
manipulate a learner by poisoning its training data. In the first study, the False Positive Rate
(FPR) of an open-source spam filter is increased to 40% by feeding the filter a training
set made up of 99% regular legitimate and spam messages, and 1% dictionary attack spam
messages containing legitimate words. By increasing the FPR the adversary effects a Denial
of Service attack on the filter. In the second case-study, the False Negative Rate of a popular
network-wide anomaly detector based on Principal Components Analysis is increased
7-fold (increasing the attacker’s chance of subsequent evasion by the same amount) by a
variance injection attack of chaff traffic inserted into the network at training time. This
high-variance chaff traffic increases the traffic volume by only 10%. In both cases the effects
of increasing the information or the control available to the adversary are explored, and
effective counter-measures are thoroughly evaluated, including a method based on Robust
Statistics for the network anomaly detection domain.

The second class of attack explored on learning systems involves an adversary aiming
to evade detection by a previously-trained classifier. In the evasion problem the attacker
searches for a negative instance of almost-minimal distance to some target positive, by
submitting a small number of queries to the classifier. Efficient query algorithms are developed
for almost-minimizing Lp cost over any classifier partitioning feature space into two classes,
one of which is convex. For the case of a convex positive class and p ≤ 1, algorithms with
linear query complexity are provided, along with lower bounds that almost match; when
p > 1 a threshold phenomenon occurs whereby exponential query complexity is necessary
for good approximations. For the case of a convex negative class and p ≥ 1, a randomized
Ellipsoid-based algorithm finds almost-minimizers with polynomial query complexity. These
results show that learning the decision boundary is sufficient, but not necessary for evasion,
and can require much greater query complexity.

The third class of attack aims to violate the confidentiality of the learner’s training data
given access to a learned hypothesis. Mechanisms for releasing Support Vector Machine
(SVM) classifiers are developed. Algorithmic stability of the SVM is used to prove that the
mechanisms preserve differential privacy, meaning that for an attacker with knowledge of all
but one training example and the learning map, very little can be determined about the final
unknown example using access to the trained classifier. Bounds on utility are established
for the mechanisms: the privacy-preserving classifiers’ predictions should approximate the
SVM’s predictions with high probability. In the case of learning with translation-invariant
kernels corresponding to infinite-dimensional feature spaces (such as the RBF kernel), a
recent result from large-scale learning is used to enable a finite encoding of the SVM while
maintaining utility and privacy. Finally, lower bounds on achievable differential privacy are
derived for any mechanism that well-approximates the SVM.

The second part of this dissertation considers Security under the lens of Machine Learning.
The first application of Machine Learning is to a learning-based reactive defense. The
CISO risk management problem is modeled as a repeated game in which the defender must
allocate security budget to the edges of a graph in order to minimize the additive profit
or return on attack (ROA) enjoyed by an attacker. By reducing to results from Online
Learning, it is shown that the profit/ROA from attacking the reactive strategy approaches
that of attacking the best fixed proactive strategy over time. This result contradicts the
conventional dogma that reactive security is usually inferior to proactive risk management.
Moreover, in many cases it is shown that the reactive defender greatly outperforms proactive
approaches.

The second application of Machine Learning to Security is for the construction of an
attack on open-source software systems. When an open-source project releases a new version
of its system, it discloses vulnerabilities in previous versions, sometimes with pointers to
the patches that fixed them. Using features of diffs in the project’s open-source repository,
labeled by such disclosures, an attacker can train a model for discriminating between security
patches and non-security patches. As new patches land in the open-source repository, before
being disclosed as security or not, and before being released to users, the attacker can use
the trained model to rank the patches according to likelihood of being a security fix. The
adversary can then examine the ordered patches one-by-one until finding a security patch.
For an 8-month period of Firefox 3’s development history it is shown that an SVM-assisted
attacker need only examine one or two patches per day (as selected by the SVM) in order
to increase the aggregate window of vulnerability by 5 months.


Dedicated to my little Lachlan.

Chapter 1
Introduction


‘Where shall I begin, please your Majesty?’ he asked. ‘Begin at the beginning,’
the King said, gravely, ‘and go on till you come to the end: then stop.’

- Lewis Carroll


1.1  Research  in  the  Intersection 


The intersection of Machine Learning, Statistics and Security is ripe for research. Today
Machine Learning and Statistics are used in an ever-increasing number of real-world systems,
including web search (Agichtein et al., 2006; Arguello et al., 2009; Joachims, 2002), online
advertising (Ciaramita et al., 2008; Ghosh et al., 2009; Immorlica et al., 2005), email spam
filtering (Meyer and Whateley, 2004; Ramachandran et al., 2007; Robinson, 2003), anti-virus
software (Kim and Karp, 2004; Newsome et al., 2005), power management (Bodik et al. [...]
2003; [...] Mukkamala et al., 2002; Soule et al. [...]; Zhang et al., 2005), finance (Agarwal
et al., 2010; Hazan and Kale, 2010; Stoltz and Lugosi, 2005), and health (Baldi and Brunak,
2001; Brown et al., 2000; Sankararaman et al., 2009). In many of these systems, human
participants are incentivized to game the system’s adaptive component in an attempt to gain
some advantage. It is important for the success of such applications of Machine Learning and
Statistics that practitioners quantify the vulnerabilities present in existing learning
techniques, and have access to learning mechanisms designed to operate in adversarial
environments. Viewing Machine Learning and Statistics through such a lens of Computer
Security has the potential to yield significant impact on practice and fundamental
understanding of adaptive systems. Indeed, for many application areas of Machine Learning,
Security and Privacy should be placed on the same level as more traditional properties such
as statistical performance, computational efficiency, and model interpretability.

An equally fruitful exercise is to study Computer Security through a lens of Machine
Learning. In Computer Security it is common practice for researchers to construct attacks
exploiting security flaws in existing systems (Nature, 2010). When the protocol of responsible
disclosure is followed (whereby a new vulnerability is not publicized until the development
team has had the opportunity to patch the affected system), research into attacks can
benefit both the developers and legitimate users of a system. As cyber-criminals are becoming
emboldened by ever-more sophisticated attacks, it is now necessary for Security researchers
to consider how Machine Learning and Statistics might be leveraged for constructing
intelligent attacks. In a similar vein, security practitioners can apply tools from Machine
Learning to build effective new defenses that learn from complex patterns of benign and
malicious behavior. In turn, such adaptive defenses fall under the umbrella of learning in
adversarial environments as motivated above.

Figure 1.1: Organization of this dissertation’s chapters into two related parts.

This dissertation describes several research projects in the intersection of Machine Learning,
Statistics, Security and Privacy, and explores questions relating to each of the topics
in the intersection described above. As depicted in Figure 1.1, this work is made up of
two parts which explore Secure Machine Learning and applications of Machine Learning to
Security respectively.


Chapter Organization. The remainder of this section summarizes the main themes of
the two parts of this dissertation. Section 1.2 summarizes general related work in Learning,
Statistics and Security, and Section 1.3 concludes the chapter with an introduction to an
important aspect of threat models for learning systems: the adversary’s capabilities.


1.1.1 Secure Machine Learning

Understanding the Security (and Privacy) of Machine Learning methods, and designing
learners for adversarial environments, are two endeavors of vital importance for the
applicability of Machine Learning in the real world.

          Chapter   Experimental   Theoretical   Attacks   Defenses
Part I       2           •                          •          •
             3                          •           •
             4                          •                      •
Part II      5                          •                      •
             6           •                          •          •

Table 1.1: A classification of this dissertation’s chapters as being experimental or theoretical
in nature, and by the inclusion of attacks and/or defenses.

Examples of Machine Learning application domains in which attackers are incentivized
to exploit learning abound. Content publishers desiring increased page views to drive up
advertising income will undertake black hat search engine optimization through participating
in link farms (Gyongyi and Garcia-Molina, 2005). Spammers will attempt to evade Gmail
email filtering by obfuscating the true nature of their email spam messages by including good
tokens in their mail (Lowd and Meek, 2005a). Insurance companies will attempt to learn
details of hospitals’ patient visits and conditions through linking published ‘anonymized’
data or statistics on private hospital databases in order to form better estimates of risk
when considering applicants for health insurance plans (Rindfleisch, 1997; Sweeney, 2002).
These three examples help motivate the problems studied in each of the three chapters of
Part I.

Part I of this dissertation considers both the security analysis of Machine Learning
techniques (the so-called ‘Security of Machine Learning’) and, wherever possible, learning
techniques that exhibit desirable Security or Privacy properties. Each chapter within Part I
considers a different class of attack on learning systems.


• Chapter 2. This chapter presents two case-studies on manipulating Machine Learning
systems by poisoning training data. The first study is of the open-source SpamBayes
project for filtering email spam, and the second is of network-wide anomaly detection
based on Principal Component Analysis. Both case-studies quantify the performance
of the learner in the presence of an intelligent attacker, and both studies evaluate
counter-measures for reducing the effects of the constructed attacks.


• Chapter 3. Where Chapter 2 considers manipulation of the training data, this chapter
considers manipulation of test data of a previously trained classifier with the goal
of evading detection. In this chapter, the evasion problem is considered in an abstract
setting where the attacker searches for a minimal-cost instance that will go undetected
by the classifier, while submitting only a small (polynomial) number of queries to the
classifier. Efficient attack algorithms are developed for classifiers that partition feature
space into two sets, one of which is convex.


• Chapter 4. Where the previous two chapters study attacks that focus on manipulating
the learner or the learner’s predictions, this chapter considers settings where
an adversary may wish to extract information about a learner’s specific training data
given access to the learned model that aggregates general statistical information about
the data. The problem of releasing a trained Support Vector Machine (SVM) classifier
while preserving the privacy of the training data is considered. Mechanisms
with privacy and (statistical) utility guarantees are proposed, alongside negative
results that bound the achievable privacy of any mechanism that discloses an accurate
approximation to the SVM.
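
To make the query-based evasion setting described in the bullets above concrete, the following Python sketch performs a binary search along a single line segment between a positive target and a known negative instance, using only membership queries. It is a minimal illustration under stated assumptions and not one of the algorithms developed in this dissertation: the names `boundary_search` and `classify`, the tolerance `tol`, and the unit-ball classifier in the usage lines are all hypothetical.

```python
import math


def boundary_search(classify, positive, negative, tol=1e-3):
    """Binary search on the segment from `positive` to `negative`.

    classify(x) returns True when x is labeled positive (detected).
    Assumes classify(positive) is True and classify(negative) is False.
    Returns an undetected point close to a boundary crossing on the
    segment, together with the number of queries issued.
    """
    lo, hi = 0.0, 1.0  # invariant: point at lo is positive, point at hi is negative
    queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        point = [p + mid * (n - p) for p, n in zip(positive, negative)]
        queries += 1
        if classify(point):
            lo = mid
        else:
            hi = mid
    evading = [p + hi * (n - p) for p, n in zip(positive, negative)]
    return evading, queries


# Hypothetical classifier with a convex positive class: the unit ball.
def in_unit_ball(x):
    return math.sqrt(sum(v * v for v in x)) <= 1.0


target = [0.0, 0.0]        # positive instance the attacker starts from
known_negative = [3.0, 4.0]  # some instance already known to evade detection
evading_point, num_queries = boundary_search(in_unit_ball, target, known_negative)
print(evading_point, num_queries)
```

Each halving of the interval costs one query, so the number of queries grows only logarithmically in 1/`tol`. A search along one direction merely locates a boundary crossing on that line; finding an instance of almost-minimal cost overall requires coordinating such searches across many directions.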


The chapters of Part I vary in length (Chapter 2 being the largest as it discusses two
major case-studies) and vary in being theoretical vs. experimental in nature: while
Chapter 2 is mostly experimental, the contributions of Chapters 3 and 4 are theoretical in nature.
The former chapter considers both attacks and defenses on learning systems, while the
latter chapters consider only attacks and defenses respectively.¹ Table 1.1 summarizes these
differences.

Finally, the research reported in Chapters 2 and 3 was joint work with two other UC
Berkeley EECS doctoral candidates: Marco Barreno and Blaine Nelson. In both of these
chapters I provide a brief summary of my contributions to the projects, in relation to those
of Barreno and Nelson. I was the sole/lead graduate student on the research reported in
the remainder of this dissertation.


1.1.2  Machine  Learning  for  Security 

While Part I considers learning in the presence of adversaries, the chapters of Part II of
this dissertation view Computer Security through a lens of Machine Learning. Two kinds of
opportunities are apparent for Machine Learning and Statistics in Security research. First,
statistical models of a software system can be used to form effective attacks on the system.
Second, learning can be leveraged to model legitimate and/or malicious behavior so as to
build defenses that adapt to intelligent adversaries and benign data drift. Both applications
are represented by the chapters of Part II as follows, and described in Table 1.1.

• Chapter 5. This chapter applies Online Learning Theory, which follows a game-theoretic
approach to learning, to the problem of risk management. Risk management
is modeled as a repeated game in which the adversary may attack a system to gain some
profit or Return on Investment (ROI). The defender’s goal is to allocate her defensive
budget to minimize the attacker’s profit or ROI. A learning-based reactive approach
to risk management (where budget is allocated based on past attacks) is proposed.
Using a reduction to results from Online Learning Theory, the new reactive defender
is compared against fixed proactive strategies (where budget is allocated based on
estimating risks and playing a fixed allocation). A major strength of the theoretical
comparisons is that they hold for all sequences of attacks: no strong assumptions are
placed on the adversary by the analysis.

• Chapter 6. While Chapter 5 applies machine learning theory to construct defenses,
this chapter describes how to apply practical learning algorithms to construct attacks
on open-source software projects. In particular, as a concrete case-study we apply the
Support Vector Machine (which is also studied under a different setting in Chapter 4)
to finding vulnerabilities in Mozilla’s open-source Firefox web browser. Defensive
measures for mitigating the effects of the developed attacks are also proposed.

¹ The attacks (defenses) presented in Chapter 3 (Chapter 4, respectively) are accompanied by strong
guarantees for the attacker (defender, respectively), obviating the need for counter-measures.


1.2  Related  Work 

We  now  overview  past  work  that  is  of  general  relevance  to  the  topic  of  this  dissertation. 
Discussion  of  related  work  that  is  specific  to  a  single  chapter  is  deferred  to  the  particular 
chapter. 


1.2.1  Related  Tools  from  Statistics  and  Learning 

Two entire subfields of Machine Learning and Statistics address questions related to
learning in the presence of an adversary who attempts to manipulate the learning process by
poisoning the training data (the topic of Chapter 2): Robust Statistics and Online Learning
Theory. We briefly overview these areas here and discuss how each is applied within this
dissertation. A later chapter includes a discussion of the current tools’ inadequacies for the
unique challenges of secure learning. The problem of privacy-preserving learning (the topic
of Chapter 4) is a third, burgeoning area of Machine Learning which has been previously
studied in Databases (statistical databases, Adam and Worthmann, 1989), TCS (differential
privacy, Dwork, 2008), and Statistics (data confidentiality and statistical disclosure control,
Doyle et al., 2001; Willenborg and de Waal, 2001). We defer discussion of privacy-preserving
learning to Chapter 4.


Robust Statistics. The field of Statistics that values traditional properties of estimators
such as consistency and asymptotic efficiency, together with robustness (estimators that are
not overly influenced by outliers), is known as Robust Statistics (Hampel et al., 1980; Huber,
1981). In the presence of outliers, or more generally violations of modeling assumptions, a
robust estimator should have low bias, high efficiency and be asymptotically unbiased.

A common measure of the robustness of a statistic is its breakdown point: the largest
p ∈ [0, 0.5] such that letting a fraction p of the sample tend to ∞ does not pull the statistic
to ∞ as well. Classic examples of statistics with good and bad breakdown points are
the estimators of location: the median and mean with maximum and minimum possible
breakdown points of 0.5 and 0 respectively.
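
The contrast between the two breakdown points can be checked numerically. The Python sketch below is illustrative only: the helper `contaminate`, the sample of 101 evenly spaced values and the 40% contamination level are assumptions chosen for the example. It replaces a fixed fraction of a sample with ever-larger outliers and reports the mean and the median.

```python
import statistics


def contaminate(sample, fraction, outlier):
    """Replace the first `fraction` of the sample with a single large value."""
    k = int(len(sample) * fraction)
    return [outlier] * k + list(sample[k:])


clean = [float(x) for x in range(1, 102)]  # 1.0, 2.0, ..., 101.0

for outlier in (1e3, 1e6, 1e9):
    dirty = contaminate(clean, 0.4, outlier)
    # The mean follows the outliers, while the median stays inside the
    # range of the clean data because fewer than half the points moved.
    print(outlier, statistics.mean(dirty), statistics.median(dirty))
```

With 40% of the points sent towards infinity the median remains within the range of the uncontaminated data, whereas the mean grows without bound; a single contaminated point would already suffice to move the mean arbitrarily far.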

One approach to finding robust estimators is via influence functions, which measure the
effect on the asymptotic bias of an estimator, of an infinitesimal contamination at a point.
Robust estimators should have bounded influence functions: the estimator should not go to
∞ as the point diverges. In particular the influence functions of M-estimators (estimators
derived by minimizing the sum of a score function over the sample) are proportional to
the derivative of the chosen score function, and so such estimators can be designed with
robustness in mind.
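
As an illustration of designing an M-estimator through the derivative of its score function, the Python sketch below uses the Huber score, whose derivative is linear near zero and clipped beyond a threshold, so that the influence of any one point is bounded. This is a sketch under assumptions rather than a method used in this dissertation: the function names, the tuning constant 1.345 (a conventional choice for unit-scale data), the iteratively reweighted averaging scheme and the sample are all chosen for the example.

```python
import statistics


def huber_psi(r, c=1.345):
    """Derivative of the Huber score: identity on [-c, c], clipped outside."""
    return max(-c, min(c, r))


def huber_location(sample, c=1.345, iters=100):
    """Location M-estimate by iteratively reweighted means (unit scale assumed)."""
    theta = statistics.median(sample)
    for _ in range(iters):
        weights = []
        for x in sample:
            r = x - theta
            # Weight psi(r) / r equals 1 for small residuals and decays
            # like c / |r| for large ones, which bounds each point's pull.
            weights.append(1.0 if r == 0 else huber_psi(r, c) / r)
        theta = sum(w * x for w, x in zip(weights, sample)) / sum(weights)
    return theta


sample = [0.0, 0.1, -0.2, 0.3, -0.1, 1e6]  # five inliers and one gross outlier
print(statistics.mean(sample), huber_location(sample))
```

For the squared score the derivative is the identity, which is unbounded and recovers the sample mean; clipping the derivative is what keeps the Huber estimate near the inliers in the example.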


Remark 1. We note in passing similarities between Robust Statistics and a seemingly
unrelated topic touched on in this dissertation. Dwork and Lei (2009) demonstrated through
several detailed examples that robust estimators can serve as the basis for privacy-preserving
mechanisms, by exploiting the limited influence of outliers on robust estimators. Given that
the typical route for transforming a statistic into a mechanism that preserves differential
privacy is via the statistic’s sensitivity to data perturbations (Dwork et al., 2006), such
a connection should not be too surprising. Still, finding a general connection between
robustness and privacy remains an open problem. We develop a privacy-preserving Support
Vector Machine in Chapter 4 via algorithmic stability, which is an area of learning theory
that exploits smoothness of the learning map to yield risk bounds.
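
The sensitivity-based route mentioned in the remark can be sketched for a very simple statistic. The Python example below releases a clipped sample mean with Laplace noise calibrated to the statistic's sensitivity. It is an illustrative sketch, not the SVM mechanism of this dissertation: the names `laplace_noise` and `private_mean`, the clipping bounds and the privacy parameter `epsilon` are assumptions made for the example, and neighboring datasets are taken to differ by replacing one of the n records.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Release the mean of values clipped to [lower, upper] with Laplace noise.

    Replacing one of the n records moves the clipped mean by at most
    (upper - lower) / n; this sensitivity sets the noise scale.
    """
    rng = rng or random.Random()
    n = len(values)
    clipped = [min(upper, max(lower, v)) for v in values]
    sensitivity = (upper - lower) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


data = [0.5] * 1000
print(private_mean(data, 0.0, 1.0, epsilon=1.0, rng=random.Random(0)))
```

The less a statistic can be moved by any single record, the less noise is needed for a given `epsilon`, which is one way to read the link between limited influence and privacy noted above.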


In Chapter 2 we develop data poisoning attacks on Principal Components Analysis
(PCA), a feature reduction method that selects features which capture the maximal amount
of variance in a dataset. For a counter-measure to our attacks, we turn to robust versions
of PCA that maximize alternative robust measures of scale: the median absolute deviation
(with the optimal breakdown point of 0.5) in place of the variance (having the worst
breakdown point of zero). Empirical evaluations of Robust PCA show good resilience to
our attacks, cutting the increased False Negative Rates down by half.
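
The effect of swapping the variance for the median absolute deviation can be seen in a toy two-dimensional example. The Python sketch below searches a grid of directions for the one whose one-dimensional projections maximize a chosen measure of scale. It is an illustration under assumptions and not the implementation evaluated in this dissertation: the helper names, the one-degree grid, and the synthetic data (101 points spread along the horizontal axis plus 10 injected high-variance points along the vertical axis) are hypothetical.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation from the median."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def best_direction(points, scale, steps=180):
    """Angle in degrees whose 1-D projections maximize the `scale` measure."""
    best_angle, best_value = 0.0, float("-inf")
    for k in range(steps):
        angle = k * 180.0 / steps
        c, s = math.cos(math.radians(angle)), math.sin(math.radians(angle))
        value = scale([x * c + y * s for x, y in points])
        if value > best_value:
            best_angle, best_value = angle, value
    return best_angle


# Legitimate data varies along the horizontal axis with slight vertical jitter.
clean = [(float(i - 50), 0.1 * ((i % 3) - 1)) for i in range(101)]
# Injected points add a large amount of variance along the vertical axis.
injected = [(0.0, 500.0 * (-1) ** j) for j in range(10)]
data = clean + injected

print(best_direction(clean, statistics.pvariance))
print(best_direction(data, statistics.pvariance))
print(best_direction(data, mad))
```

On this data the variance-maximizing direction swings towards the injected points, while the direction maximizing the median absolute deviation stays close to the axis along which the legitimate data varies.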


Online Learning Theory. While a common assumption in Machine Learning and Statistics
is i.i.d. data or some other independence assumption, Online Learning Theory considers
learning without any assumptions on the data generation process; the data need not even
be stochastic. Online Learning, closely related to universal prediction, thus follows a
game-theoretic approach to learning (Cesa-Bianchi and Lugosi, 2006).

Given a sequence of arbitrary instances, the learner predicts labels for the instances
as they are iteratively revealed. Upon each round the learner incurs some loss, so that
over T rounds the learner accumulates a cumulative loss that can be arbitrarily bad as
measured in absolute terms. Instead, Online Learning compares the learner’s performance
with that of the best performing decision rule or expert among a set of experts that provide
advice to the learner throughout the repeated game. This results in studying the regret:
the difference between the learner’s cumulative loss and the best expert’s cumulative loss
with hindsight. The goal of Online Learning is to design learning strategies that achieve
average regret converging to zero. A motivating example for regret is in online portfolio
optimization (Stoltz and Lugosi, 2005), where a simple goal of an investor selecting between
N stocks (the experts) is to achieve a portfolio (a random strategy over the experts) that
asymptotically performs as well as the best stock on average.
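
Regret can be made concrete with the exponentially weighted forecaster, a standard strategy from the Online Learning literature. The Python sketch below is illustrative only: the function name `hedge`, the learning rate `eta`, the horizon and the synthetic loss sequence are assumptions for the example. With losses in [0, 1] and eta = sqrt(8 ln N / T), the expected regret of this strategy is known to be at most sqrt((T / 2) ln N) for every loss sequence.

```python
import math


def hedge(loss_rounds, eta):
    """Exponentially weighted forecaster over N experts.

    loss_rounds is a list of per-round loss vectors with entries in [0, 1].
    Returns the learner's expected cumulative loss and the cumulative loss
    of the best single expert in hindsight.
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n
    learner_loss = 0.0
    expert_loss = [0.0] * n
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]  # random strategy over experts
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            expert_loss[i] += l
            weights[i] *= math.exp(-eta * l)  # down-weight experts that erred
    return learner_loss, min(expert_loss)


T, N = 1000, 4
# Arbitrary deterministic losses; no stochastic assumption is involved.
rounds = [[1.0 if (t * (i + 2)) % (i + 3) == 0 else 0.0 for i in range(N)]
          for t in range(T)]
eta = math.sqrt(8.0 * math.log(N) / T)
learner, best = hedge(rounds, eta)
print(learner - best, math.sqrt(T / 2.0 * math.log(N)))
```

Dividing the bound by T gives an average regret of order sqrt(ln N / T), which converges to zero as the number of rounds grows.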

Advantages  of  this  style  of  analysis  include  guarantees  that  hold  in  a  fully  adversarial 
setting,  and  the  derived  algorithms  tend  to  be  simple  to  implement  and  very  efficient  to 
run. 

In Chapter 5 we model the general security problem of risk management as a repeated
game between an attacker who gains profits depending on their chosen attack, and a defender
who allocates her security budget in order to reduce the profit enjoyed by the attacker. Via
a reduction to regret bounds from Online Learning Theory, we show that a learning-based
reactive defender who allocates budget based on past attacks achieves the
performance of the best proactive defender who can exploit prior knowledge of the system’s
vulnerabilities to form minimax allocations. By appealing to regret bounds, the analysis of
our learning-based defense has the advantage that it allows for a worst-case attacker, even
one that has full knowledge of the learner’s state and algorithm, from which it can form
intelligent attacks.


1.2.2  Attacks  on  Learning  Systems 


Barreno et al. (2006, 2010) categorize attacks against machine learning systems along
three dimensions. The axes of their taxonomy are as follows:


Influence 


•  Causative  attacks  influence  learning  with  control  over  training  data. 

•  Exploratory  attacks  exploit  misclassifications  but  do  not  affect  training. 


Security  violation 

•  Integrity  attacks  compromise  assets  via  false  negatives. 

•  Availability  attacks  cause  denial  of  service,  usually  via  false  positives. 


Specificity 

•  Targeted  attacks  focus  on  a  particular  instance. 

•  Indiscriminate  attacks  encompass  a  wide  class  of  instances. 

The  first  axis  of  the  taxonomy  describes  the  capability  of  the  attacker:  whether  (a) 
the  attacker  has  the  ability  to  influence  the  training  data  that  is  used  to  learn  a  model  (a 
Causative  attack)  or  (b)  the  attacker  does  not  influence  the  learner,  but  can  submit  test 
instances  to  the  learned  model,  and  observe  the  resulting  responses  (an  Exploratory  attack). 

The second axis indicates the type of security violation caused on a classifier (where
we consider malicious/benign instances as belonging to the positive/negative class): (a)
false negatives, in which malicious instances slip through the filter (an Integrity violation);
or (b) false positives, in which innocuous instances are incorrectly filtered (an Availability
violation).²

The  third  axis  refers  to  how  specific  the  attacker’s  intention  is:  whether  (a)  the  attack 
is  Targeted  to  degrade  the  learner’s  performance  on  particular  types  of  instances  or  (b)  the 
attack  aims  to  cause  the  learner  to  fail  in  an  Indiscriminate  fashion  on  a  broad  class  of 
instances. 
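
The three axes can be captured in a small data structure, which is convenient when labeling attacks programmatically. The Python sketch below is a hypothetical encoding: the class and member names are chosen for the example, and the sample classification describes a dictionary-style poisoning attack on a spam filter as an assumption for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Influence(Enum):
    CAUSATIVE = "influences learning with control over training data"
    EXPLORATORY = "exploits misclassifications without affecting training"


class Violation(Enum):
    INTEGRITY = "compromises assets via false negatives"
    AVAILABILITY = "causes denial of service, usually via false positives"


class Specificity(Enum):
    TARGETED = "focuses on a particular instance"
    INDISCRIMINATE = "encompasses a wide class of instances"


@dataclass(frozen=True)
class AttackClass:
    """One point in the three-axis taxonomy."""
    influence: Influence
    violation: Violation
    specificity: Specificity


# Illustrative labeling: poisoning a spam filter's training set so that
# legitimate mail in general is filtered.
dictionary_attack = AttackClass(
    Influence.CAUSATIVE, Violation.AVAILABILITY, Specificity.INDISCRIMINATE
)
print(dictionary_attack)
```

Because the dataclass is frozen, instances are hashable and can serve as dictionary keys, for example when grouping prior work by attack class.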

² Considerations of false positives or false negatives apply specifically to learning for classification; however
these violations extend to other kinds of learning as well (e.g., an Integrity attack on a regression may aim
to avoid a certain real-valued response, while an Availability attack may aim to perturb the responses so
much as to cause a DoS attack on the learner itself).

          Chapter   Influence                Violation         Specificity
Part I    2(i)      Causative                Availability      Targeted, Indiscriminate
          2(ii)     Causative                Integrity         Targeted
          3         Exploratory              Integrity         Targeted
          4         Causative, Exploratory   Confidentiality   Targeted
Part II   5         Causative, Exploratory   Integrity         Targeted
          6         N/A                      Confidentiality   Targeted

Table 1.2: The contributions of each chapter classified by the taxonomy of attacks on
learning systems of Barreno et al. (2006). Each classification is discussed within the
corresponding chapter.


Table 1.2 classifies the chapters of this dissertation according to the above taxonomy.
Chapter 5 develops defensive risk management strategies (the learner’s task is more complex
than classification) with the goal of minimizing attacker profit or ROI. The adversary’s
attacks can be regarded as attempting to evade allocations of defensive budget, and so can
be regarded as Integrity attacks. Chapters 4 and 6 concern attacks that violate neither
Integrity nor Availability, but rather Confidentiality (a third security violation introduced
below). Moreover, while Chapter 6 does not involve an attack on a learner per se, we consider
the public release of patches to the Firefox open-source project to be highly informative
statistics of an underlying dataset (the patches combined with their labels as either ‘security’
or ‘non-security’). Based on these statistics, our attacks violate the Confidentiality of the
data’s undisclosed labels.

Related Work: Attacks on Learners. Several authors have previously considered attacks
covering a range of attack types described by the taxonomy.

Most prior work on attacking learning systems considers Exploratory attacks, in which
the adversary submits malicious test points to a pre-trained classifier. To the best of our
knowledge, all of these Exploratory attacks focus on Integrity violations that cause false
negatives. Lowd and Meek (2005b) consider an abstract formulation of what we call the
evasion problem, in which the attacker wishes to make minimal-cost alterations to a positive
instance (e.g., an email spam message) such that the modified instance is labeled negative
by the classifier (the modified message reaches the victim’s inbox). They derive query-based
algorithms for Boolean and linear classifiers; in the latter case their algorithms not only
find instances that evade detection, but in so doing learn the classifier’s decision boundary.
In Chapter  we generalize their work to classifiers that partition feature space into two
sets, one of which is convex. For this larger family of classifiers, learning the decision
boundary is known to be NP-hard (Dyer and Frieze, 1992; Rademacher and Goyal, 2009). In
earlier work Tan et al. (2002) and Wagner and Soto (2002) independently designed mimicry
attacks for evading sequence-based intrusion detection systems (IDSs). By analyzing the
IDS offline, they modify exploits to mimic benign behavior not detected by the IDS. Fogla
and Lee (2006) design polymorphic blending attacks on IDSs that encrypt malicious traffic
to become indistinguishable from innocuous traffic. By contrast our algorithms for evading
convex-inducing classifiers search by querying the classifier online. In the email spam
domain Wittel and Wu (2004) and Lowd and Meek (2005a) consider good word attacks that
add common words to spam messages to allow them to pass through an email spam filter;
in an alternate approach Karlberger et al. (2007) replace tokens in spam messages that have
strong spam scores with synonyms. In the realm of counter-measures designed specifically
for Exploratory attacks, Dalvi et al. (2004) consider an optimal cost-based defense for naive
Bayes for a rational, omniscient attacker. In Chapter  we apply Online Learning Theory
to design a learning-based reactive risk management strategy, which faces an adversary
whose attacks may try to side-step the defensive budget allocations to the system. Since
our performance guarantees for the reactive strategy are derived under extremely weak
conditions on the adversary, we can guarantee that over time the success of such Exploratory
attacks is limited.
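
To make the query-based evasion setting concrete, the following minimal sketch performs a
binary search along the line segment between a detected positive instance and a known
negative instance, using only membership queries to a black-box classifier. It is not the
algorithm of Lowd and Meek (2005b), nor the one developed in this dissertation; it only
illustrates the kind of line search such query-based methods build on. The function name
`line_search_evasion`, the toy linear classifier and the tolerance are assumptions
introduced purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def line_search_evasion(
    classify: Callable[[Sequence[float]], bool],
    positive: Sequence[float],
    negative: Sequence[float],
    tol: float = 1e-6,
) -> Tuple[List[float], int]:
    """Return a point labeled negative that lies close to the decision boundary.

    `classify` returns True for instances that are detected (e.g. spam).
    The search only moves along the segment from `positive` to `negative`,
    so the result is a low-cost alteration in that one direction only.
    """
    if not classify(positive) or classify(negative):
        raise ValueError("need one detected and one undetected instance")
    queries = 2  # the two sanity-check queries above
    lo, hi = 0.0, 1.0  # fraction of the way from `positive` to `negative`
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        point = [p + mid * (n - p) for p, n in zip(positive, negative)]
        queries += 1
        if classify(point):
            lo = mid  # still detected: move further toward the negative instance
        else:
            hi = mid  # evades detection: try to stay closer to the original
    evading = [p + hi * (n - p) for p, n in zip(positive, negative)]
    return evading, queries


def toy_linear_classifier(x: Sequence[float]) -> bool:
    # Hypothetical detector: flags any instance with x[0] + x[1] > 1.
    return x[0] + x[1] > 1.0


evading_point, num_queries = line_search_evasion(
    toy_linear_classifier, positive=[2.0, 2.0], negative=[0.0, 0.0]
)
```

The number of queries grows only logarithmically in 1/tol, which is why the attacker's
query budget is a natural measure of the cost of this kind of Exploratory attack.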

A relatively small number of prior studies have investigated Causative attacks, in which
the adversary manipulates the learner by poisoning its training data (although the areas
of Online Learning and Robust Statistics both address learning under such attacks, in
specific settings). Newsome et al. (2006) study red herring Causative Integrity attacks on the
Polygraph polymorphic worm detector (Newsome et al., 2005), that aim to increase false
negatives. Their attacks include spurious features in positive training examples (worms) so that
subsequent malicious instances can evade being detected by the conjunction learner, which
copes poorly with high levels of irrelevant features. The authors also consider correlated
outlier Causative Availability attacks, which mis-train the learner into blocking benign
traffic. Chung and Mok (2006, 2007) send traffic to the Autograph worm detector (Kim and
Karp, 2004) that is flagged as malicious. Subsequent legitimate-looking traffic sent from
the same node results in rules that block similar traffic patterns, including truly legitimate
traffic.

Kearns and Li (1993) consider the Probably Approximately Correct (PAC) model of
learning (Valiant, 1984), in the presence of an adversary that manipulates a portion of the
training data. In this theoretical work the authors bound the classification error in terms of
the level of malicious noise, and bound the maximum level of noise tolerable for learnability.
In Chapter  we consider Causative Availability and Causative Integrity attacks on the
SpamBayes email spam filter, and the PCA-based network anomaly detector respectively.
The former case-study is the first Causative Availability attack of its kind, while the second
study explores a system of recent popularity in the Systems and Measurement communities.
In both cases we consider effective counter-measures for our attacks, including one
that uses Robust Statistics. Finally, relative to all fixed proactive strategies, our reactive
risk management strategy presented in Chapter  can handle both Exploratory attacks and
Causative attacks by virtue of the worst-case nature of our analysis.


The Third Security Violation: Confidentiality. The taxonomy of Barreno et al.
(2006) describes the attacker’s goals through the violation and specificity axes, broadly
covering most situations where the attacker wishes to manipulate a learner or its predictions.
However, as is apparent from an increasing body of research on the privacy of statistical
estimators (Dwork, 2010), a third security violation that can be achieved by an adversary
attacking an adaptive system may be one of confidentiality. The International Organization
for Standardization (2005) defines confidentiality as “ensuring that information is accessible
only to those authorized to have access”, and indeed confidentiality is a cornerstone of
information security alongside integrity and availability (Wikipedia, 2010). In particular
confidentiality should be considered as a general kind of security violation, and should
include attacks that reveal information about the learner’s state (Barreno, 2008), parameters
to the learner, or the learner’s training data.

Numerous authors have studied Confidentiality attacks on statistical databases that
release statistics, learned models, or even anonymized forms of the data itself. A common
form of real-world attack on statistical databases is to exploit side information available
via public data covering overlapping features and rows of the database. Sweeney (2002)
used public voter registration records to identify individuals (including the Governor) in an
‘anonymized’ database of hospital records of Massachusetts state employees where patient
names had been removed. In a similar case Narayanan and Shmatikov (2008) identified users
in an ‘anonymized’ movie rating dataset released by Netflix for their collaborative filtering
prize competition. Given a small subset of movies a customer had watched, acting as a
unique kind of signature for the customer, their attack accurately identifies the customer’s
ratings in the dataset. They applied their attack using publicly available movie ratings on
IMDB to identify Netflix customers and their previously-private tastes in movies.

The key problem with removing only explicitly personal information such as names from
a released database is that seemingly innocuous features can implicitly identify individuals
when used together. Sweeney (2002) showed in her study that gender, postal code and
birthdate are enough to uniquely identify 87% of the U.S. population. An early real-world
Confidentiality violation using this same idea was reported by the New York Times, when
journalists Barbaro and Zeller Jr. (2006) identified AOL members from very specific queries
included in an ID-scrubbed search log released by AOL research.
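
The linkage attacks described above amount to a join on quasi-identifiers. The following
sketch uses entirely hypothetical records and field names (`zip`, `birthdate`, `sex`,
`diagnosis`, `name`), none of which come from the cited studies. It re-identifies a row of
an ‘anonymized’ table whenever exactly one public record shares its quasi-identifier values.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed quasi-identifier fields, chosen only for illustration.
QUASI_IDENTIFIERS: Tuple[str, ...] = ("zip", "birthdate", "sex")


def link_records(
    anonymized: Sequence[Dict[str, str]],
    public: Sequence[Dict[str, str]],
    keys: Sequence[str] = QUASI_IDENTIFIERS,
) -> Dict[int, str]:
    """Map row numbers of `anonymized` to names that match uniquely in `public`."""
    index: Dict[Tuple[str, ...], List[str]] = {}
    for person in public:
        signature = tuple(person[k] for k in keys)
        index.setdefault(signature, []).append(person["name"])

    matches: Dict[int, str] = {}
    for row_number, record in enumerate(anonymized):
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # an ambiguous or missing signature is not re-identified
            matches[row_number] = names[0]
    return matches


# Hypothetical toy data: names were removed from the first table only.
hospital_rows = [
    {"zip": "00001", "birthdate": "1950-01-01", "sex": "M", "diagnosis": "A"},
    {"zip": "00002", "birthdate": "1980-05-05", "sex": "F", "diagnosis": "B"},
]
voter_roll = [
    {"name": "Person One", "zip": "00001", "birthdate": "1950-01-01", "sex": "M"},
    {"name": "Person Two", "zip": "00002", "birthdate": "1980-05-05", "sex": "F"},
    {"name": "Person Three", "zip": "00002", "birthdate": "1980-05-05", "sex": "F"},
]
reidentified = link_records(hospital_rows, voter_roll)
```

In this toy example only the first row is re-identified; the second is shared by two public
records, which is the kind of ambiguity that coarsening quasi-identifiers aims to create.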

A number of theoretical Confidentiality attacks have been designed in past work,
demonstrating the fundamental privacy limits of statistical database mechanisms. Dinur and
Nissim (2003) show that if noise of rate only $o(\sqrt{n})$ is added to subset sum queries on a database
of $n$ bits then an adversary can reconstruct a $1 - o(1)$ fraction of the database. This is a
threshold phenomenon that says if accuracy is too high, no level of privacy can be guaranteed.
Dwork and Yekhanin (2008) construct realistic, acute attacks in which only a fixed
number of queries is made for each bit revealed. In a similar vein we show negative results
for the privacy-preserving Support Vector Machine (SVM) setting in Chapter  where any
mechanism that is too accurate with respect to the SVM cannot guarantee high levels of
privacy.

Deriving defenses for Confidentiality attacks on learners is an active area of research in
Databases, Machine Learning, Security, Statistics and TCS (Dwork, 2008, 2010). An increasingly
popular guarantee of data privacy is provided by differential privacy (Dwork, 2006).
We provide the definition and technical details of differential privacy in Chapter  ; however,
the intuition is that even a powerful attacker with full knowledge of all but one row in a
database, the workings of the statistical database mechanism, and access to responses from
the mechanism, cannot reconstruct the final database row. Differentially private versions of
several statistics and learning algorithms have thus far been developed, including:
contingency tables (Barak et al., 2007), histograms, Principal Component Analysis, k-means, ID3,
the perceptron algorithm (Blum et al., 2005), regularized logistic regression (Chaudhuri and
Monteleoni, 2009), query and click count logs (Korolova et al., 2009), degree distributions
of graphs (Hay et al., 2009), and several recommender systems that were used in the Netflix
prize contest (McSherry and Mironov, 2009). We derive privacy-preserving mechanisms for
SVM learning in Chapter
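
As a rough illustration of how a differentially private release can work, the sketch below
applies the standard Laplace mechanism from the differential privacy literature to a
counting query. It is not one of the SVM mechanisms derived in this dissertation. The
function names, the example data and the use of Python's `random` module are assumptions
made for illustration only. A counting query changes by at most 1 when a single row
changes, so Laplace noise of scale 1/epsilon is added to the true count.

```python
import random
from typing import Callable, Iterable, Optional, TypeVar

Row = TypeVar("Row")


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    # expovariate takes a rate, so rate = 1 / scale gives exponentials of mean `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(
    rows: Iterable[Row],
    predicate: Callable[[Row], bool],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> float:
    """Release a noisy count of the rows satisfying `predicate`.

    The sensitivity of a count is 1, so the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and a stronger privacy guarantee.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng if rng is not None else random.Random()
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical usage: how many of 100 records have an even identifier?
noisy = private_count(range(100), lambda r: r % 2 == 0, epsilon=0.5,
                      rng=random.Random(0))
```

The trade-off discussed in the text is visible in the single parameter: a very large
epsilon returns nearly the exact count and offers correspondingly little privacy.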


1.3  The  Importance  of  the  Adversary’s  Capabilities 


As discussed with respect to the taxonomy above, a crucial step in protecting against
threats on Machine Learning systems is to understand the threat model in adversarial learning
domains. The threat model can broadly be described as the attacker’s goals and capabilities.
Typical attacker goals are well-represented by the taxonomy of Barreno et al.
(2006) described above. However this taxonomy considers the adversary’s capabilities only at
the coarsest level, as being able to manipulate the training and/or test data. A finer
grained analysis of adversarial capabilities may consider the level of information and the
level of control possessed by the adversary.


Definition  2.  Adversarial  information  is  the  adversary’s  knowledge  of  the  learning  system 
and  environment,  such  as  the  learner’s  features,  the  learning  algorithm,  the  current  decision 
function,  the  policy  for  training  and  retraining,  and  the  benign  data  generation  process. 


Definition  3.  Adversarial  control  is  the  extent  of  the  attacker’s  control  over  the  learning 
system’s  training  and/or  test  data. 

A  number  of  examples  illustrating  the  roles  of  adversarial  information  and  control  now 
follow. 


Example 4. In email spam filtering, relevant adversarial information may include the user’s
language, common types of email the user receives, which spam filter the user has, and the
particular training corpus or distribution used to create the spam filter (or knowledge of a
similar distribution). Adversarial control may include choosing the bodies of a fraction of
emails (perhaps only spam), controlling email headers directly or indirectly, and controlling
how the user receives messages. This control could be exerted over messages used for training
or for run-time testing.


Example 5. In network-wide traffic anomaly detection (Lakhina et al., 2004a), adversarial
information may include the network topology, routing tables, real-time traffic volumes along
one or more links, historical traffic along one or more links, and the training policies of the
anomaly detection system. Adversarial control may include controlling one or more links
to give false traffic reports or compromising one or more routers to inject chaff into the
network.


Example 6. In the domain of phishing webpage detection, adversarial information may
include user language and country, email client, web browser, financial institution, and
employer. Adversarial control may include choosing the content and/or headers of the phishing
emails and potentially influencing training datasets of known phishing sites, such as
PhishTank (2010).


In the sequel, assumptions on adversarial information and control will be made explicit.
An interesting and important research direction is to consider analyses that quantify the
value of the information and control available to the adversary for attacks against learning
systems. We revisit such open questions in Chapter

Chapter  considers case studies in Causative Integrity and Availability attacks on classifiers
with special attention paid to the effects of increasing adversarial information or
control. Chapter  develops Exploratory Integrity attacks on classifiers where bounds are
derived on the number of query test points required by the attacker: in turn these correspond
to the amount of control the adversary has over the test data. Finally the attacks on
Firefox constructed in Chapter  either utilize meta-data about commits to an open-source
repository or are oblivious to the commit details. Once again, this corresponds to the level
of information available to the attacker. Chapters  and  grant significant amounts of
information and control to the adversary, as they derive results in worst-case settings.

Part I

Private and Secure Machine Learning

Chapter 2

Poisoning  Classifiers 


Expect  poison  from  standing  water. 

-  William  Blake 


Statistical Machine Learning techniques have recently garnered increased popularity as
a means to filter email spam (Ramachandran et al., 2007; Robinson, 2003) and improve
network design and security (Bahl et al., 2007; Cheng et al., 2007; Kandula et al., 2008;
Lazarevic et al., 2003), as learning techniques can adapt to specifics of an adversary’s
behavior. However using Statistical Machine Learning for making security decisions introduces
new vulnerabilities in large-scale systems due to this very adaptability. This chapter
develops attacks that exploit Statistical Machine Learning, as used in the SpamBayes email spam
filter (Meyer and Whateley, 2004; Robinson, 2003) and the Principal Component Analysis
(PCA)-subspace method for detecting anomalies in backbone networks (Lakhina et al.,
2004a).


In the language of the taxonomy of Barreno et al. (2006) (cf. Section 1.2.2), the attacks
of this chapter are case-studies in Causative attacks: the adversary influences the classifier
by manipulating the learner’s training data. The attacks on SpamBayes are Availability
attacks in that they aim to increase the false positive rate, and so reduce the availability, of
the spam filter (constituting a DoS attack on the learning component itself). By contrast the
presented attacks on PCA are Integrity attacks that aim to increase the false negative rate or
chance of evasion. Finally the attacks on PCA are Targeted in that they facilitate specific false
negatives. Both Indiscriminate and Targeted attacks on SpamBayes are presented, and
special attention is paid to the adversary’s capabilities, with attacks exploiting a range of
adversarial information and control compared experimentally throughout.

As the presented attacks highlight and quantify the severity of existing vulnerabilities
in the SpamBayes and PCA-based systems, it becomes necessary to design defensive
approaches that are less susceptible to tampering. In both of the training data poisoning
case-studies, counter-measures are proposed that reduce the effects of the presented attacks.


The SpamBayes case-study of Section 2.2 presents joint work with UCB EECS doctoral
candidates Marco Barreno and Blaine Nelson. In that study I contributed equally to the
design of the attacks on SpamBayes, while Barreno and Nelson were responsible for their
implementation and the experimental analysis. In Section 2.3’s case-study on PCA-based
anomaly detection I was the lead graduate student, in joint work with Nelson. While we
equally contributed to the design of the attacks and the defenses, I was responsible for the
implementation and experimental analysis.


2.1  Introduction 


Applications use Statistical Machine Learning to perform a growing number of critical
tasks in virtually all areas of computing. The key strength of Machine Learning is
adaptability; however, this can become a weakness when an adversary manipulates the learner’s
environment. With the continual growth of malicious activity and electronic crime, the
increasingly broad adoption of learning makes assessing the vulnerability of learning systems
to manipulation an essential problem.

The question of robust decision making in systems that rely on Machine Learning is of
interest in its own right. But for security practitioners, it is especially important, as a wide
swath of security-sensitive applications build on Machine Learning technology, including
intrusion detection systems, virus and worm detection systems, and spam filters (Bahl et al.,
2007; Cheng et al., 2007; Kandula et al., 2008; Lakhina et al., 2004a; Lazarevic et al., 2003;
Liao and Vemuri, 2002; Meyer and Whateley, 2004; Mukkamala et al., 2002; Newsome et al.,
2005; Ramachandran et al., 2007; Robinson, 2003; Soule et al., 2005; Stolfo et al., 2006;
Zhang et al., 2005). These solutions draw upon a variety of techniques from the SML domain
including Singular Value Decomposition, clustering, Bayesian inference, spectral analysis,
maximum-margin classification, etc.; and in many scenarios, these approaches have been
demonstrated to perform well in the absence of Causative attacks on the learner.

Past Machine Learning research has often proceeded under the assumption that learning
systems are provided with training data drawn from a natural distribution of inputs. Such
techniques have a serious vulnerability, however, as in many real-world applications an
attacker has the ability to provide the learning system with maliciously chosen inputs that
cause the system to infer poor classification rules. In the spam domain, for example, the
adversary can send carefully crafted spam messages that a human user will correctly identify
and mark as spam, but which can influence the underlying learning system and adversely
affect its ability to correctly classify future messages. A similar scenario is conceivable for
the network anomaly detection domain, where an adversary could carefully inject traffic
into the network so that the detector mis-learns its model of normal traffic patterns.

This chapter explores two in-depth case-studies into Causative attacks, and corresponding
counter-measures, against Machine Learning systems. The first case-study considers the
email spam filtering problem, where the attacker’s goal is to increase False Positive Rates
as a denial-of-service attack on the learner itself. The second case-study explores a problem
in network anomaly detection where the adversary’s goal for poisoning is to increase the
False Negative Rate so that subsequent attacks through the network can go undetected.
Throughout the chapter, the key roles of the adversary’s capabilities of information and
control are highlighted, and their effect on the attacks’ damage measured.

Chapter Organization. This section is completed with a survey of previous work related
to poisoning learners. Section 2.2 describes in detail a Causative attack case-study on email
spam filtering, while Section 2.3 details a case-study on network anomaly detection. Finally
the chapter is concluded with a summary of its main contributions in Section 2.4.


2.1.1  Related  Work 

Many authors have examined adversarial learning from a theoretical perspective. For
example, within the Probably Approximately Correct framework, Kearns and Li (1993)
bound the classification error that an adversary with control over a fraction $\beta$ of the training
set can cause. Dalvi et al. (2004) apply game theory to the classification problem: they model
interactions between the classifier and attacker as a game and find the optimal counter-strategy
for the classifier against an optimal opponent. The learning theory community has
focused on online learning (Cesa-Bianchi and Lugosi, 2006), where data is selected by an
adversary with complete knowledge of the learner, and has developed efficient algorithms
with strong guarantees. However, the simplifying assumption of all data being produced
by an omniscient adversary does not hold for many practical threat models. Given the
increasing popularity of SML techniques, we believe exploring adversarial learning with
realistic threat models is important and timely.

A handful of studies have considered Causative attacks on SML-based systems. Newsome
et al. (2006) present attacks against Polygraph (Newsome et al., 2005), a polymorphic
virus detector that uses Machine Learning. They suggest a correlated outlier attack, which
attacks a naive-Bayes-like learner by adding spurious features to positive training instances,
causing the filter to block benign traffic with those features (a Causative Availability attack).
Focusing on conjunction learners, they present Causative Integrity red herring attacks that
again include spurious features in positive training examples so that subsequent malicious
instances can evade detection by excluding these features. Our attacks use similar ideas,
but we develop and test them on real systems in other domains, explore the
value of information and control to an attacker, and present and test defenses against
the attacks. Venkataraman et al. (2008) present lower bounds for learning worm signatures
based on red herring attacks and reductions to classic results from Query Learning. Chung
and Mok (2006, 2007) present a Causative Availability attack against the earlier Autograph
worm signature generation system (Kim and Karp, 2004), which infers blocking rules based
on patterns observed in traffic from suspicious nodes. The main idea is that the attack node
first sends traffic that causes Autograph to mark it suspicious, then sends traffic similar to
legitimate traffic, resulting in rules that cause a denial of service.

Most existing attacks against content-based spam filters in particular are Exploratory
attacks that do not influence training but engineer spam messages so they pass through the
filter. For example, Lowd and Meek (2005a,b) explore reverse-engineering a spam classifier
to find high-value messages that the filter does not block, Karlberger et al. (2007) study the
effect of replacing strong spam words with synonyms, and Wittel and Wu (2004) study the
effect of adding common words to spam to get it through a spam filter. Another Exploratory
attack is the polymorphic blending attack of Fogla and Lee (2006), which encrypts malicious
traffic so that the traffic is indistinguishable from innocuous traffic to an intrusion detection
system. By contrast our variance injection attacks add small amounts of high-variance
chaff traffic to PCA’s training data, to make the data appear more like future DoS attacks.
We return to the problem of evading a classifier with carefully crafted test instances in
Chapter


Finally Ringberg et al. (2007) performed a study of the sensitivities of the PCA-based
detection method studied in Section 2.3 that illustrates how the PCA method can be
sensitive to the number of principal components used to describe the normal subspace. This
parameter can limit PCA’s effectiveness if not properly configured. They also show that
routing outages can pollute the normal subspace, a kind of perturbation to the subspace
that is not adversarial. Our work differs in two key ways. First we demonstrate a different
type of sensitivity, namely that of data poisoning. This adversarial perturbation can be
stealthy and subtle, and is more challenging to circumvent than observable routing outages.
Second, Ringberg et al. (2007) focus on showing the variability in PCA’s performance to
certain sensitivities, and not on defenses. In our work, we propose a robust defense against
a malicious adversary and demonstrate its effectiveness. It is conceivable that the technique
we propose could help limit PCA’s sensitivity to routing outages, although such a study
is beyond the scope of this work. A recent study (Brauckhoff et al., 2009) showed that
the sensitivities observed by Ringberg et al. (2007) come from PCA’s inability to capture
temporal correlations. They propose to replace PCA by a Karhunen-Loève expansion. Our
study indicates that it would be important to examine, in future work, the data poisoning
robustness of this proposal.


2.2  Case-Study  on  Email  Spam 

This section demonstrates how attackers can exploit Machine Learning to subvert spam
filters. Our attack strategies exhibit two key differences from previous work: traditional
attacks modify spam emails to evade a spam filter, whereas our attacks interfere with the
training process of the learning algorithm and modify the filter itself; and rather than focus
only on placing spam emails in the victim’s inbox, we subvert the spam filter to remove
legitimate emails from the inbox (see the theses of Barreno (2008) and Saini (2008) for poisoning
attacks that cause spam to evade filtering).

An  attacker  may  have  one  of  two  goals:  expose  the  victim  to  an  advertisement  or  prevent 
the  victim  from  seeing  a  legitimate  message.  Potential  revenue  gain  for  a  spammer  drives 
the  first  goal,  while  the  second  goal  is  motivated,  for  example,  by  an  organization  competing 
for  a  contract  that  wants  to  prevent  competing  bids  from  reaching  their  intended  recipient. 


Tying in with adversarial information (cf. Section 1.3), an attacker may have detailed
knowledge of a specific email the victim is likely to receive in the future, or the attacker
may know particular words or general information about the victim’s word distribution. In
many cases, the attacker may know nothing beyond which language the emails are likely to
use.


When an attacker wants the victim to see spam emails, a broad dictionary attack can
render the spam filter unusable, causing the victim to disable the filter (cf. Section 2.2.2.2).
With more information about the email distribution, the attacker can select a smaller
dictionary of high-value features that is still effective. When an attacker wants to prevent
a victim from seeing particular emails and has some information about those emails, the
attacker can target them with a focused attack (cf. Section 2.2.2.3).
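
The effect of a dictionary attack can be illustrated on a deliberately simplified filter.
The sketch below is a toy token-frequency scorer, not the SpamBayes algorithm, and every
name, word list and count in it is an assumption made for illustration. The attacker sends
messages containing a dictionary of words the victim is likely to use; once the victim
correctly labels them as spam and the filter trains on them, legitimate mail that uses
those words scores as more spam-like.

```python
from collections import Counter
from typing import Iterable


class ToyTokenFilter:
    """Toy token-frequency spam scorer (an illustrative stand-in, not SpamBayes)."""

    def __init__(self) -> None:
        self.spam_counts: Counter = Counter()
        self.ham_counts: Counter = Counter()

    def train(self, tokens: Iterable[str], is_spam: bool) -> None:
        # Count each distinct token once per training message.
        target = self.spam_counts if is_spam else self.ham_counts
        target.update(set(tokens))

    def token_score(self, token: str) -> float:
        spam, ham = self.spam_counts[token], self.ham_counts[token]
        return (spam + 1) / (spam + ham + 2)  # smoothed spam fraction, 0.5 if unseen

    def score(self, tokens: Iterable[str]) -> float:
        distinct = set(tokens)
        if not distinct:
            return 0.5
        return sum(self.token_score(t) for t in distinct) / len(distinct)


spam_filter = ToyTokenFilter()
for ham_tokens in (["meeting", "budget", "report"], ["lunch", "meeting", "tomorrow"]):
    spam_filter.train(ham_tokens, is_spam=False)
for spam_tokens in (["cheap", "pills", "offer"], ["offer", "winner", "prize"]):
    spam_filter.train(spam_tokens, is_spam=True)

future_ham = ["budget", "meeting", "tomorrow"]
score_before = spam_filter.score(future_ham)

# Dictionary attack: five attack messages, each containing the attacker's guess
# at the victim's vocabulary, are labeled spam and used for training.
dictionary = ["meeting", "budget", "report", "lunch", "tomorrow"]
for _ in range(5):
    spam_filter.train(dictionary, is_spam=True)

score_after = spam_filter.score(future_ham)
```

In this toy setting the legitimate message moves from the ham side of 0.5 to the spam side.
A focused attack would follow the same pattern with a dictionary restricted to words the
attacker expects in one particular target email.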

We demonstrate the potency of these attacks and then present two defenses. The Reject
On Negative Impact (RONI) defense tests the impact of each email on training and
does not train on messages that have a large negative impact. The dynamic threshold
defense dynamically sets the spam filter’s classification thresholds based on the data rather
than using SpamBayes’ static choice of thresholds. We show that both defenses are effective
in preventing some attacks from succeeding.
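
The idea behind RONI, as stated above, can be sketched as follows. This is only an
illustration of the principle and makes several assumptions that are not taken from the
text: the toy learner `fit_toy_filter`, the use of validation accuracy as the measure of
impact, the `max_drop` tolerance, and all of the example messages are hypothetical. Each
candidate message is added to a trusted training set on its own, and it is rejected if the
resulting classifier performs noticeably worse on held-out labeled mail.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Email = Tuple[Sequence[str], bool]  # (tokens, is_spam)


def fit_toy_filter(training: Sequence[Email]) -> Callable[[Sequence[str]], bool]:
    """Train a toy token-frequency classifier (a stand-in for a real spam filter)."""
    spam, ham = Counter(), Counter()
    for tokens, is_spam in training:
        (spam if is_spam else ham).update(set(tokens))

    def predict_spam(tokens: Sequence[str]) -> bool:
        distinct = set(tokens)
        if not distinct:
            return False
        score = sum((spam[t] + 1) / (spam[t] + ham[t] + 2) for t in distinct)
        return score / len(distinct) > 0.5

    return predict_spam


def accuracy(predict: Callable[[Sequence[str]], bool], emails: Sequence[Email]) -> float:
    return sum(predict(tokens) == label for tokens, label in emails) / len(emails)


def roni_filter(candidates: Sequence[Email], trusted: Sequence[Email],
                validation: Sequence[Email], max_drop: float = 0.05) -> List[Email]:
    """Keep only candidates whose inclusion does not hurt validation accuracy much."""
    baseline = accuracy(fit_toy_filter(trusted), validation)
    accepted = []
    for email in candidates:
        with_email = accuracy(fit_toy_filter(list(trusted) + [email]), validation)
        if baseline - with_email <= max_drop:  # small or no negative impact
            accepted.append(email)
    return accepted


trusted = [(["meeting", "budget", "report"], False), (["lunch", "meeting", "tomorrow"], False),
           (["cheap", "pills", "offer"], True), (["offer", "winner", "prize"], True)]
validation = [(["invoice", "meeting"], False), (["budget", "contract"], False),
              (["cheap", "offer", "prize"], True), (["winner", "pills"], True)]

ordinary_spam = (["cheap", "prize", "winner"], True)
dictionary_attack = (["invoice", "meeting", "budget", "contract", "report", "lunch"], True)
ordinary_ham = (["meeting", "contract"], False)

accepted = roni_filter([ordinary_spam, dictionary_attack, ordinary_ham], trusted, validation)
```

In this toy run the dictionary attack message is the only candidate rejected, since
training on it causes both legitimate validation messages to be misclassified.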

We focus on the learning algorithm underlying several spam filters, including SpamBayes
(spambayes.sourceforge.net), BogoFilter (bogofilter.sourceforge.net), and the machine learning
component of SpamAssassin (spamassassin.apache.org). We target the open-source
SpamBayes system because it uses a pure machine learning method, it is familiar to the
academic community (Meyer and Whateley, 2004), and it is popular, with over 1,800,000
downloads. Although we specifically attack SpamBayes, the widespread use of its statistical
learning algorithm suggests that other filters may also be vulnerable to similar attacks.

Our experimental results confirm that this class of attacks presents a serious concern for
statistical spam filters, even when the adversary has only limited control over the learner (again,
tying back to the importance of adversarial capabilities, cf. Section 1.3). A dictionary attack
can make a spam filter unusable when controlling just 1% of the messages in the training
set, and a well-informed focused attack can remove the target email from the victim’s inbox
90% of the time. Of our two defenses, one significantly mitigates the effect of the dictionary
attack and the other provides insight into the strengths and limitations of threshold-based
defenses.


2.2.1  Background  on  Email  Spam  Filtering 

We now briefly overview common learning models for email spam filtering, and detail
the SpamBayes learning algorithm.

2.2.1.1 Training model

SpamBayes produces a classifier from labeled examples to predict the true class of future
emails. The labels used by SpamBayes consist of spam (bad, unsolicited email), ham
(good, legitimate email), and unsure (SpamBayes isn’t confident one way or the other). The
classifier learns from a labeled training set or corpus of ham and spam emails.

Email clients (applications for viewing and manipulating email messages) use these labels
in different ways: some clients filter email labeled as spam and unsure into “Spam-High”
and “Spam-Low” folders, respectively, while other clients only filter email labeled as spam
into a separate folder. Since the typical user reads most or all email in their inbox and
rarely (if ever) looks at the spam/spam-high folder, the unsure labels can be problematic. If
unsure messages are filtered into a separate folder, users may periodically read the messages
in that folder to avoid missing important email. If instead unsure messages are not filtered,
then the user is confronted with those messages when checking the email in their inbox.
Too much unsure email is almost as troublesome as too many false positives (ham labeled
as spam) or false negatives (spam labeled as ham). In the extreme, if everything is labeled
unsure then the user obtains no time savings at all from the filter.

^The primary difference between the learning elements of these three filters is in their tokenization
methods.

In our scenarios, an organization uses SpamBayes to filter multiple users’ incoming email^
and trains on everyone’s received email. SpamBayes may also be used as a personal email
filter, in which case the presented attacks and defenses are likely to be equally effective.

To keep up with changing trends in the statistical characteristics of both legitimate
and spam email, we assume that the organization retrains SpamBayes periodically (e.g.,
weekly). Our attacks are not limited to any particular retraining process; they only require
that the attacker can introduce attack data into the training set somehow (the contamination
assumption). In the next section, we justify this assumption but the purpose of this
investigation is only to analyze the effect of poisoned datasets.


2.2.1.2 SpamBayes Learning Method

SpamBayes makes classifications using token scores based on a simple model of spam
status proposed by Meyer and Whateley (2004); Robinson (2003), based on ideas by Graham
(2002) together with Fisher’s method for combining independent significance tests (Fisher,
1948).


SpamBayes tokenizes the header and body of each email before constructing token spam
scores. Robinson’s method assumes that the presence or absence of tokens in an email affects
its spam status independently. For each token w, the raw token spam score

\[
P_{(S,w)} = \frac{N_H N_S(w)}{N_H N_S(w) + N_S N_H(w)}
\]

is computed from the counts $N_S$, $N_H$, $N_S(w)$, and $N_H(w)$: the number of spam emails, ham
emails, spam emails that include w, and ham emails that include w.

Robinson smooths $P_{(S,w)}$ through a convex combination with a prior belief x, weighting
the quantities by N(w) (the number of training emails with w) and s (chosen for strength
of prior), respectively:

\[
f(w) = \frac{s}{s + N(w)}\, x + \frac{N(w)}{s + N(w)}\, P_{(S,w)}. \tag{2.1}
\]
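The two token scores can be sketched in a few lines of Python. This is only an illustration of the raw score and Equation (2.1), not SpamBayes' own code: the function names, the placeholder values x = 0.5 and s = 1, and the neutral fallback for tokens never seen in training are assumptions of the sketch.

```python
def raw_token_score(n_spam, n_ham, n_spam_w, n_ham_w):
    """Raw token spam score P_(S,w) computed from training counts."""
    denominator = n_ham * n_spam_w + n_spam * n_ham_w
    if denominator == 0:
        # Token never seen in training: the raw score is undefined.
        # Returning 0.5 here is an assumption of this sketch.
        return 0.5
    return (n_ham * n_spam_w) / denominator


def smoothed_token_score(n_spam, n_ham, n_spam_w, n_ham_w, x=0.5, s=1.0):
    """Smoothed token score f(w) of Equation (2.1).

    x (prior belief) and s (prior strength) are placeholder values.
    """
    n_w = n_spam_w + n_ham_w  # N(w): number of training emails containing w
    p = raw_token_score(n_spam, n_ham, n_spam_w, n_ham_w)
    return (s / (s + n_w)) * x + (n_w / (s + n_w)) * p
```

With these placeholder values, a token seen in 10 spam emails and no ham emails has raw score 1 but smoothed score 10.5/11, so the prior pulls scores away from the extremes when evidence is scarce.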


For a new email message E, Robinson uses Fisher’s method to combine the spam scores
of the most significant tokens^ into a message score

\[
I(E) = \frac{1 + H(E) - S(E)}{2} \in [0,1], \tag{2.2}
\]

where

\[
H(E) = 1 - \chi^2_{2n}\Big(-2 \sum_{w \in \delta(E)} \log f(w)\Big), \qquad
S(E) = 1 - \chi^2_{2n}\Big(-2 \sum_{w \in \delta(E)} \log\big(1 - f(w)\big)\Big),
\]

and where $\chi^2_{2n}(\cdot)$ denotes the cumulative distribution function of the chi-square distribution
with 2n degrees of freedom. SpamBayes predicts by thresholding against two user-tunable
thresholds $\theta_0$ and $\theta_1$, with defaults $\theta_0 = 0.15$ and $\theta_1 = 0.9$: SpamBayes predicts ham, unsure,
or spam if I falls into the interval $[0, \theta_0]$, $(\theta_0, \theta_1]$, or $(\theta_1, 1]$, respectively.

^We use the terms user and victim interchangeably for either organization or individual; the meaning
will be clear from context.

^SpamBayes uses at most 150 tokens from E with scores furthest from 0.5 and outside the interval
[0.4, 0.6]. We call this set $\delta(E)$.
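A minimal sketch of the message score in Equation (2.2) and the thresholding rule follows. It is not SpamBayes' implementation. It takes n to be the number of tokens in $\delta(E)$ (the usual choice in Fisher's method, assumed here), uses the closed form of the chi-square survival function for an even number of degrees of freedom, and invents the function names, the clipping constant, and the neutral score for an empty token set.

```python
import math


def select_tokens(scores, max_tokens=150, low=0.4, high=0.6):
    """Return delta(E): scores outside [low, high], furthest from 0.5 first."""
    significant = [f for f in scores if f < low or f > high]
    significant.sort(key=lambda f: abs(f - 0.5), reverse=True)
    return significant[:max_tokens]


def chi2_survival_even(x, n):
    """P(X > x) for a chi-square variable X with 2n degrees of freedom.

    For even degrees of freedom the survival function has the closed form
    exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!.
    """
    half = x / 2.0
    term = math.exp(-half)  # underflows to 0.0 for very large x
    total = term
    for k in range(1, n):
        term *= half / k
        total += term
    return min(total, 1.0)


def message_score(token_scores, eps=1e-12):
    """Message score I(E) of Equation (2.2) from the scores in delta(E)."""
    n = len(token_scores)
    if n == 0:
        return 0.5  # assumption: no significant tokens gives a neutral score
    # Clip scores away from 0 and 1 so the logarithms stay finite.
    clipped = [min(max(f, eps), 1.0 - eps) for f in token_scores]
    h = chi2_survival_even(-2.0 * sum(math.log(f) for f in clipped), n)
    s = chi2_survival_even(-2.0 * sum(math.log(1.0 - f) for f in clipped), n)
    return (1.0 + h - s) / 2.0


def classify(score, theta0=0.15, theta1=0.9):
    """Map a message score to ham / unsure / spam using the default thresholds."""
    if score <= theta0:
        return "ham"
    if score <= theta1:
        return "unsure"
    return "spam"
```

A message whose significant tokens all have scores near 1 yields H(E) close to 1 and S(E) close to 0, so I(E) approaches 1 and the message is labeled spam; the mirror case gives ham.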

The inclusion of an unsure category prevents us from purely using misclassification rates
(false positives and false negatives) for evaluation. We must also consider spam-as-unsure
and ham-as-unsure emails. Because of the considerations in Section 2.2.1.1, unsure
misclassifications of ham emails are nearly as bad for the user as false positives.


2.2.2  Attacks 


We now present Causative Availability attacks on SpamBayes, i.e., attacks that aim
to increase the False Positive Rate of the learned classifier by manipulating the training
data. After describing the contamination assumption that we can realistically inject spam
messages into the training corpus (Section 2.2.2.1), we detail both Indiscriminate
(Section 2.2.2.2) and Targeted (Section 2.2.2.3) attacks.


2.2.2.1 The Contamination Assumption

We assume that the attacker can send emails that the victim will use for training
(the contamination assumption) but incorporate two significant restrictions: attackers may
specify arbitrary email bodies but not headers, and attack emails are always trained as
spam and not ham. We examine the implications of the contamination assumption in the
remainder of this section.

How can an attacker contaminate the training set? Consider the following alternatives.
If the victim periodically retrains on all email, any email the attacker sends will be used for
training. If the victim manually labels a training set, the attack emails will be included as
spam because they genuinely are spam. Even if the victim retrains only on mistakes made
by the filter, the attacker may be able to design emails that both perform our attacks and
are also misclassified by the victim’s current filter. We do not address the possibility that a
user might inspect training data to remove attack emails; our attacks could be adjusted to
evade detection strategies such as email size or word distributions, but we avoid pursuing
this arms race here.

Our focus on spam-labeled attack emails should be viewed as a restriction and not a
necessary condition for the success of the attacks: using ham-labeled attack emails could
enable more powerful attacks that place spam in a user’s inbox (Barreno, 2008; Saini, 2008).


2.2.2.2 Dictionary Attacks

Our first attack is an Indiscriminate attack. The idea behind the attack is to send attack
emails that contain many words likely to occur in legitimate email. When the victim trains
SpamBayes with these attack emails marked as spam, the spam scores of the words in the
attack emails will increase. Future legitimate email is more likely to be marked as spam
if it contains words from the attack email. With a sufficient increase to the False Positive
Rate, the victim will disable the spam filter, or at least must frequently search through
spam/unsure folders to find legitimate messages that were filtered away. In either case, the
victim loses confidence in the filter and is forced to view more spam: the victim sees the
attacker’s spam.

Depending on the level of information available to the adversary, s/he may be able to
construct more effective attacks on the spam filter.


Knowledge of Victim’s Language. When the attacker lacks knowledge about the victim’s
email, one simple attack is to include an entire dictionary of the English language
(or more generally a dictionary of the victim’s native tongue). This technique is the basic
dictionary attack. We use the GNU aspell English dictionary version 6.0-0, containing
98,568 words.

The dictionary attack increases the score of every token in a dictionary of English words
(i.e., it makes them more indicative of spam). After it receives a dictionary spam message,
the victim’s spam filter will have a higher spam score for every token in the dictionary, an
effect that is amplified for less frequent tokens: in particular, the spam scores of vulnerable
tokens increase dramatically. Furthermore, the long-tailed Zipf distribution of natural
human language implies that a victim’s future non-spam email will likely contain several
vulnerable tokens, increasing the filter’s spam score for that email.
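The amplification for infrequent tokens can be checked numerically with the token score of Equation (2.1). The sketch below is illustrative only: the counts are made up, the prior values x = 0.5 and s = 1 are placeholders, and the helper names are hypothetical.

```python
def token_score(n_spam, n_ham, n_spam_w, n_ham_w, x=0.5, s=1.0):
    """Smoothed token spam score f(w) of Equation (2.1)."""
    n_w = n_spam_w + n_ham_w
    denominator = n_ham * n_spam_w + n_spam * n_ham_w
    # Assumption: an unseen token gets a neutral raw score of 0.5.
    raw = (n_ham * n_spam_w) / denominator if denominator else 0.5
    return (s * x + n_w * raw) / (s + n_w)


def score_shift(n_spam, n_ham, n_spam_w, n_ham_w, n_attack):
    """Score of w before and after training on n_attack spam emails containing w."""
    before = token_score(n_spam, n_ham, n_spam_w, n_ham_w)
    # Each attack email is trained as spam and contains the token w.
    after = token_score(n_spam + n_attack, n_ham, n_spam_w + n_attack, n_ham_w)
    return before, after


# Hypothetical counts for a 10,000-message training set (50% spam).
frequent = score_shift(5000, 5000, n_spam_w=10, n_ham_w=1000, n_attack=100)
rare = score_shift(5000, 5000, n_spam_w=0, n_ham_w=5, n_attack=100)
```

With these made-up counts, the token that appears in only five ham emails moves from a ham-like score to a strongly spam-like one, while the token that appears in a thousand ham emails moves far less. The rare token is the "vulnerable" one.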

(Limited) Knowledge of Victim’s Word Distribution. A further refinement uses
a word source with distribution closer to the victim’s email distribution. For example, a
large pool of Usenet newsgroup postings may have colloquialisms, misspellings, and other
“words” not found in a dictionary; furthermore, using the most frequent words in such a
corpus may allow the attacker to send smaller emails without losing much effectiveness.

This attack exploits the sparsity of tokens in human text (i.e., most people use small
vocabularies). As mentioned above, in natural language there are a small number of words
that are used frequently and a large number of words (a long tail) that are used infrequently.

2.2.2.3 Focused Attack

Our second kind of attack, the focused attack, assumes knowledge of a specific legitimate
email or type of email the attacker wants blocked by the victim’s spam filter. This is
a Causative Targeted Availability attack. In the focused attack, the attacker sends attack
emails to the victim containing words likely to occur in the target email. When SpamBayes
trains on this attack email, the spam scores of the targeted tokens increase, so the target
message is more likely to be filtered as spam.

For  example,  consider  a  malicious  contractor  wishing  to  prevent  the  victim  from  receiving 
messages  with  competing  bids.  The  attacker  sends  spam  emails  to  the  victim  with  words 
such  as  the  names  of  competing  companies,  their  products,  and  their  employees.  The  bid 
messages  may  even  follow  a  common  template  known  to  the  attacker,  making  the  attack 
easier  to  craft. 

The  attacker  may  have  different  levels  of  knowledge  about  the  target  email.  In  the 
extreme  case,  the  attacker  might  know  the  exact  content  of  the  target  email  and  use  all 
of  its  words.  More  realistically,  the  attacker  only  guesses  a  fraction  of  the  email’s  content. 
In  either  case,  the  attack  email  may  include  additional  words  as  well,  e.g.,  drawn  from  a 
general  distribution  over  email  text  to  obfuscate  the  message’s  intent. 

Like the Usenet distribution-based attack, the focused attack is more concise than the
dictionary attack: because the attacker has detailed knowledge of the target and need not
affect other messages, the attack can leave out words that are unlikely to appear in the
target. This conciseness makes the attack both more efficient for the
attacker and more difficult to detect as an attack.


2.2.2.4 A Principled Justification of the Dictionary and Focused Attacks


The dictionary and focused attacks can be seen as two instances of a common attack
in which the attacker has different levels of information about the victim’s email. Without
loss of generality, suppose the attacker generates only a single attack message a. The victim
adds it to the training set as spam, trains, and classifies a (random) new text message M
in the future. Since SpamBayes operates under a (typical) bag-of-words model, both a and
M are indicator vectors, where the i-th component is true iff word i appears in the email.
The attacker also has some (perhaps limited) knowledge of the next email the victim will
receive. This knowledge can be represented as a distribution D, the vector of probabilities
that each word appears in the next message. That is, the attacker assumes that M ~ D,
and has reason to believe that D is related to the true underlying email distribution of the
victim.

The goal of the attacker is to choose an attack email a that maximizes the expected spam
score $I_a$ (Equation 2.2 with the attack message a included in the training corpus) of the
next legitimate email M drawn from distribution D; that is, the attacker’s goal is to select
an attack message in

\[
\arg\max_{a} \; \mathbb{E}_{M \sim D}\left[ I_a(M) \right].
\]

In order to describe the optimal attacks under this criterion, we make two observations.
First, the spam scores of distinct words do not interact; that is, adding a word w to the
attack does not change the score f(u) of some different word u ≠ w. Second, it is easy
to show that I is non-decreasing in each f(w). Therefore the best way to increase $I_a$ is to
include additional words in the attack message.

Now  let  us  consider  specific  choices  for  the  next  email’s  distribution  D.  First,  if  the 
attacker  has  little  knowledge  about  the  words  in  target  emails,  the  attacker  can  set  D  to  be 
uniform  over  all  emails.  We  can  optimize  the  resulting  expected  spam  score  by  including 
all  possible  words  in  the  attack  email.  This  optimal  attack  is  infeasible  in  practice  (as  it 
includes  misspellings,  etc.)  but  can  be  approximated:  one  approximation  includes  all  words 
in  the  victim’s  primary  language,  such  as  an  English  dictionary.  This  yields  the  dictionary 
attack. 

Second, if the attacker has specific knowledge of a target email, we can represent this
by setting D(i) to 1 iff the i-th word is in the target email. The above ‘optimal attack’ still
maximizes the expected spam score, but a more compact attack that also optimizes the
expected spam score is to include all of the words in the target email. This produces the
focused attack.

The  attacker’s  knowledge  usually  falls  between  these  extremes.  For  example,  the  attacker 
may  use  information  about  the  distribution  of  words  in  English  text  to  make  the  attack  more 
efficient,  such  as  characteristic  vocabulary  or  jargon  typical  for  the  victim.  Either  way,  the 
adversary’s  information  results  in  a  distribution  D  over  words  in  the  victim’s  emails. 


2.2.3  Attack  Results 

We now present experiments launching the attacks described above on the SpamBayes
email spam filter.


2.2.3.1 Experimental Method

Dataset. In our experiments we use the Text Retrieval Conference (TREC) 2005 spam
corpus (Cormack and Lynam, 2005), which is based on the Enron email corpus (Klimt and
Yang, 2004) and contains 92,189 emails (52,790 spam and 39,399 ham). This corpus has
several strengths: it comes from a real-world source, it has a large number of emails, and
its creators took care that the added spam does not have obvious artifacts to differentiate
it. We also use a corpus constructed from a subset of Usenet English postings to generate
words for our attacks (Shaoul and Westbury, 2007).


Training Method. For each experiment, we sample a dataset of email without replacement
from the TREC corpus. We create a control model by training SpamBayes once
only on the training set. Each of our attacks creates a different type of attack email for
SpamBayes to use in training, producing tainted models.

When  we  require  mailboxes  of  a  specified  size,  such  as  for  training  and  test  sets,  we 
sample  ham  and  spam  emails  randomly  without  replacement  from  the  entire  TREC  spam 
corpus.  When  we  require  only  a  portion  of  an  email,  such  as  the  header,  we  randomly  select 
an  email  from  the  dataset  that  has  not  been  used  in  the  current  run,  so  that  we  ignore  email 
messages that have already been selected for use in the training set.

Parameter           | Dictionary Attack                    | Focused Attack                    | RONI Defense  | Threshold Defense
--------------------+--------------------------------------+-----------------------------------+---------------+------------------------
Training set size   | 2,000, 10,000                        | 5,000                             | 20            | 2,000, 10,000
Test set size       | 200, 1,000                           | N/A                               | 50            | 200, 1,000
Spam prevalence     | 0.50, 0.75                           | 0.50                              | 0.50          | 0.50
Attack fraction     | 0.001, 0.005, 0.01, 0.02, 0.05, 0.10 | 0.02 to 0.50 incrementing by 0.02 | 0.05          | 0.001, 0.01, 0.05, 0.10
Folds of validation | 10                                   | 5 repetitions                     | 5 repetitions | 5
Target emails       | N/A                                  | 20                                | N/A           | N/A

Table 2.1: Parameters used in our experiments.


Message Generation. We restrict the attacker to have limited control over the headers
of attack emails (see Section 2.2.2.1). We implement this assumption either by using the
entire header from a randomly selected spam email from TREC (focused attack) or by using
an empty header (all other attacks).


Method of Assessment. We measure the effect of each attack by comparing classification
performance of the control and compromised filters using K-fold cross-validation (or K
repetitions with new random dataset samples in the case of the focused attack). In cross-validation,
we partition the dataset into K subsets and perform K train-test epochs. During
the i-th epoch, the i-th subset is set aside as a test set and the remaining (K − 1) subsets
are used for training. Each email from our original dataset serves independently as both
training and test data.
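The partitioning scheme can be sketched as follows. This is a generic K-fold split rather than the experiment code: it assumes the dataset has already been shuffled, and the function name and the interleaved assignment of items to folds are choices of the sketch.

```python
def k_fold_splits(items, k):
    """Yield (train, test) pairs; epoch i holds out the i-th subset.

    Assumes `items` is already in random order. Items are assigned to
    folds in an interleaved fashion (an arbitrary choice of this sketch).
    """
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training data is the union of the other K - 1 subsets.
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        yield train, test
```

In the poisoning experiments, attack emails would be added to the training portion of each epoch only, so that the held-out test messages remain clean.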

In the following sections, we show the effect of our attacks on test sets of held-out
messages. Because our attacks are designed to cause ham to be misclassified, we only show
their effect on ham messages; their effect on spam is marginal. Our graphs do not include
error bars since we observed that the variation in our tests was small. See Table 2.1 for our
experimental parameters.


2.2.3.2 Dictionary Attack Results

We examined the effect of adversarial control on the effectiveness of dictionary attacks.
Here adversarial control is parametrized as the percent of attack messages in the training set.
Figure 2.1 shows the misclassification rates of three dictionary attack variants averaging over
ten-fold cross-validation. The optimal attack quickly causes the filter to label all ham emails
as spam. The Usenet dictionary attack (90,000 top ranked words from the Usenet corpus)
causes significantly more ham emails to be misclassified than the Aspell dictionary attack,
since it contains common misspellings and slang terms that are not present in the Aspell
dictionary (the overlap between the Aspell and Usenet dictionaries is around 61,000 words).


Figure 2.1: Three dictionary attacks on initial training set of 10,000 messages (50% spam).
We plot percent of ham classified as spam (dashed lines) and as spam or unsure (solid lines)
against the attack as percent of the training set. We show the optimal attack (black △),
the Usenet dictionary attack (blue □), and the Aspell dictionary attack (green ○). Each
attack renders the filter unusable with as little as 1% control (101 messages).


These variations of the attack require relatively few attack emails to significantly degrade
SpamBayes accuracy. By 101 attack emails (1% of 10,000), the accuracy falls significantly
for each attack variation; at this point most users will gain no advantage from continued
use of the filter.

Remark 7. Although the attack emails make up a small percentage of the number of messages
in a poisoned inbox, they make up a large percentage of the number of tokens. For
example, at 204 attack emails (2% of the messages), the Usenet attack includes approximately
6.4 times as many tokens as the original dataset and the Aspell attack includes 7
times as many tokens. An attack with fewer tokens would likely be harder to detect; however,
the number of messages is a more visible feature. It is of significant interest that such a
small number of attack messages is sufficient to degrade the performance of a widely-deployed
filtering algorithm to such a degree.
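The message counts quoted in this section (101 attack emails at 1% and 204 at 2%, for 10,000 clean messages) are consistent with measuring the attack fraction relative to the poisoned training set, i.e., clean plus attack messages. That reading is an inference from the numbers, and the helper below is a hypothetical illustration of it rather than the experiment code.

```python
def attack_message_count(n_clean, attack_fraction):
    """Number of attack emails that make up `attack_fraction` of the poisoned set.

    Solves n_attack / (n_clean + n_attack) = attack_fraction for n_attack
    and rounds to the nearest whole message.
    """
    return round(attack_fraction * n_clean / (1.0 - attack_fraction))
```

Under this reading, a 1% attack on 10,000 messages needs 101 emails and a 2% attack needs 204, matching the counts in the text.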


2.2.3.3 Focused Attack Results

In  this  section,  we  experimentally  analyze  how  many  attack  emails  are  required  for  the 
focused  attack  to  be  effective,  how  accurate  the  attacker  needs  to  be  at  guessing  the  target 
email, and whether some emails are easier to target than others.


Figure 2.2: Effect of the focused attack as a function of adversarial information. Each bar
depicts the fraction of target emails classified as spam, ham, and unsure after the attack.
The initial inbox contains 5,000 emails (with 50% spam).


Figure 2.3: Effect of the focused attack as a function of adversarial control (with adversarial
information at p = 0.5). The dashed (solid) line shows the percentage of targets misclassified
as spam (unsure or spam) after the attack. The initial inbox contains 5,000 emails (50% spam).


We  run  each  repetition  of  the  focused  attack  as  follows.  First  we  randomly  select  a 
ham  email  from  the  TREC  corpus  to  serve  as  the  target  of  the  attack.  We  use  a  clean, 
non-malicious  5,000-message  inbox  with  50%  spam.  We  repeat  the  entire  attack  procedure 
independently  for  20  randomly-selected  target  emails. 

In Figure 2.2, we examine the effectiveness of the attack when the attacker has increasing
knowledge of the target email by simulating the process of the attacker guessing tokens from
the target email. For this figure, there are 300 attack emails, 6% of the original number
of training emails. We assume that the attacker correctly guesses each word in the target
with probability p in {0.1, 0.3, 0.5, 0.9} (the x-axis of Figure 2.2). The y-axis shows the
proportion of the 20 targets classified as ham, unsure and spam. As expected, the attack is
increasingly effective as the level of adversarial information p increases. With knowledge of
only 30% of the tokens in the target, 60% of the target emails are misclassified.
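The guessing process can be simulated in a few lines. The sketch assumes each token of the target is guessed independently with probability p; the text gives the per-word probability, while independence across tokens and the function name are assumptions made here.

```python
import random


def guess_target_tokens(target_tokens, p, rng):
    """Return the target tokens the attacker guesses, each kept with probability p.

    `rng` is a random.Random instance so that runs can be reproduced.
    """
    return [token for token in target_tokens if rng.random() < p]
```

For example, `guess_target_tokens(tokens, 0.3, random.Random(0))` yields the subset of `tokens` that an attacker with p = 0.3 would place in the attack email, possibly alongside unrelated filler words.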

In Figure 2.3, we examine the attack’s effect on misclassifications of the targeted emails
as the number of attack messages (the adversarial control) increases. Here we fix the
probability of guessing each target token at 0.5. The x-axis depicts the number of messages
in the attack and the y-axis is the percent of messages misclassified. With 100 attack emails,
out of an initial mailbox size of 5,000, the target email is misclassified 32% of the time.

Additional insight can be gained by examining the attack’s effect on three representative
emails (see Figure 2.4). Each panel in the figure shows a single target email, one for
each of three possible attack outcomes: ham misclassified as spam (Left), ham
misclassified as unsure (Middle), and ham correctly classified as ham (Right). Each point
in the graph represents the before/after score of a token; any point above the line y = x
corresponds to a token score increase due to the attack and any point below the line
corresponds to a decrease. From these graphs it is clear that the scores of tokens included
in the attack typically increase significantly while those of tokens not included decrease
slightly. Since the increase in score is more significant for included tokens than the decrease
in score for excluded tokens, the attack has substantial impact even when the attacker has a
low probability of guessing tokens, as seen in Figure 2.2. Furthermore, the before/after
histograms in Figure 2.4 provide a direct indicator of the attack’s success.


Figure 2.4: Effect of the focused attack on three representative emails, one graph for each
target email. Each point is a token in the target email. The x-axis is the token spam score in
Equation (2.1) before the attack (0 means ham and 1 means spam). The y-axis is the spam
score after the attack. The red crosses are tokens that were included in the attack and the
black circles are tokens that were not in the attack. The histograms show the distribution
of spam scores before the attack (at bottom) and after the attack (at right).

Also, comparing the bottom histograms of the three panels, we can see that the attack
was most successful on emails that already contained a significant number of spam tokens
before the attack. All three emails as a whole were confidently classified as ham before the
attack, but in the successful attack, the target was closest to being classified as spam.


2.2.4  Defenses 

In  the  following  two  sections  we  consider  counter-measures  for  responding  against  the 
attacks  presented  above. 


2.2.4.1 RONI Defense


Our Causative attacks are effective since training on attack emails causes the filter to
learn incorrect token spam scores and misclassify emails. Each attack email contributes
towards the degradation of the filter’s performance; if we can measure each email’s impact
prior to training, then we can remove deleterious messages from the training set before the
classifier is manipulated.

In the Reject On Negative Impact (RONI) defense, we measure the incremental effect of
each query email Q by testing the performance difference with and without that email. We
independently sample a 20-message training set T and a 50-message validation set V five
times from the initial pool of emails given to SpamBayes for training. We train on both T
and T ∪ {Q} and measure the impact of each query email as the average change in incorrect
classifications on V over the five trials. We reject candidate message Q from training if the
impact is significantly negative. We test with 120 random non-attack spam messages and
15 repetitions each of seven variants of the dictionary attacks in Section 2.2.2.2.
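The RONI procedure can be sketched generically as below. This is an outline under stated assumptions, not the experiment code: `train` and `count_errors` are hypothetical callables standing in for SpamBayes training and for counting incorrect classifications on V, T and V are drawn disjointly (the text does not say), and the rejection threshold is left as a parameter.

```python
import random


def roni_impact(query, pool, train, count_errors, rng,
                train_size=20, val_size=50, trials=5):
    """Average increase in validation errors caused by adding `query` to T."""
    total = 0.0
    for _ in range(trials):
        # Draw T and V without overlap from the initial pool (an assumption).
        sample = rng.sample(pool, train_size + val_size)
        t, v = sample[:train_size], sample[train_size:]
        baseline = count_errors(train(t), v)
        with_query = count_errors(train(t + [query]), v)
        total += with_query - baseline
    return total / trials


def roni_filter(candidates, pool, train, count_errors, threshold, rng):
    """Keep only candidates whose average error increase is at most `threshold`."""
    return [q for q in candidates
            if roni_impact(q, pool, train, count_errors, rng) <= threshold]
```

The small training set is a deliberate feature of the design: with only 20 messages in T, a single harmful email produces a measurable change in the errors on V.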

Experiments show that the RONI defense is extremely successful against dictionary attacks,
identifying 100% of the attack emails without flagging any non-attack emails. Each
dictionary attack message causes at least an average decrease of 6.8 ham-as-ham messages.
In sharp contrast, non-attack spam messages cause at most an average decrease of 4.4
ham-as-ham messages. This clear region of separability means a simple threshold on this statistic
is effective at separating dictionary attack emails from non-attack spam.

This experiment provides some confidence that this defense would work given a larger
test set due to the large impact a small number of attack emails have on performance.

However, the RONI defense fails to differentiate focused attack emails from non-attack
emails. The explanation is simple: the dictionary attack broadly affects emails, including
training emails, while the focused attack is targeted at a future email, so its effects may not
be evident on the training set alone.


Figure 2.5: Effect of the threshold defense on the classification of ham messages with the
dictionary based attacks. We use a 10,000-message inbox training set of which 50% are spam. The
solid lines represent ham messages classified as spam or unsure while the dashed lines show
the classification rate of ham messages as spam. Threshold-.05 has a wider range for unsure
messages than the Threshold-.10 variation.


2.2.4.2 Dynamic Threshold Defense


Distribution-based attacks increase the spam score of ham email but they also tend to
increase the spam score of spam. Thus with new $\theta_0$, $\theta_1$ thresholds, it may still be possible
to accurately distinguish between these kinds of messages after an attack. Based on this
hypothesis, we propose and test a dynamic threshold defense, which dynamically adjusts
$\theta_0$, $\theta_1$. With an adaptive threshold scheme, attacks that shift all scores will not be effective
since rankings are invariant to such shifts.

To determine dynamic values of $\theta_0$ and $\theta_1$, we split the full training set in half. We
use one half to train a SpamBayes filter $F$ and the other half as a validation set $V$. Using
$F$, we obtain a score for each email in $V$. From this information, we can pick threshold
values that more accurately separate ham and spam emails. We define a utility function for
choosing threshold $t$,

$$g(t) = \frac{N_{S,<}(t)}{N_{S,<}(t) + N_{H,>}(t)} ,$$

where $N_{S,<}(t)$ is the number of spam emails with scores less than $t$ and $N_{H,>}(t)$ is the
number of ham emails with scores greater than $t$. We select $\theta_0$ so that $g(\theta_0)$ is 0.05 or 0.10,
and we select $\theta_1$ so that $g(\theta_1)$ is 0.95 or 0.90, respectively.
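
The threshold selection can be sketched in a few lines of Python. This is only an illustration and not the SpamBayes implementation: the function names `utility` and `pick_threshold`, the toy validation scores, and the rule of taking the smallest candidate score whose utility reaches the target are all assumptions made for the example.

```python
def utility(t, spam_scores, ham_scores):
    """g(t) = N_{S,<}(t) / (N_{S,<}(t) + N_{H,>}(t)); None when both counts are zero."""
    n_spam_below = sum(1 for s in spam_scores if s < t)
    n_ham_above = sum(1 for h in ham_scores if h > t)
    total = n_spam_below + n_ham_above
    return n_spam_below / total if total else None


def pick_threshold(target, spam_scores, ham_scores):
    """Smallest candidate score whose utility reaches the target (assumed rule)."""
    candidates = sorted(set(spam_scores) | set(ham_scores))
    for t in candidates:
        g = utility(t, spam_scores, ham_scores)
        if g is not None and g >= target:
            return t
    return candidates[-1]  # fall back to the largest observed score


# Hypothetical validation-set scores produced by the filter F on V.
spam = [0.30, 0.50, 0.70, 0.80, 0.90, 0.95, 0.97, 0.99]
ham = [0.01, 0.02, 0.05, 0.10, 0.20, 0.40, 0.60, 0.75]

theta0 = pick_threshold(0.05, spam, ham)
theta1 = pick_threshold(0.95, spam, ham)

# An attack that adds the same constant to every score moves the threshold with it.
shift = 0.1
theta0_shifted = pick_threshold(0.05, [s + shift for s in spam], [h + shift for h in ham])
```

Because the utility depends only on how scores compare with the candidate threshold, a uniform shift of all scores leaves the selected email rank unchanged, which is the invariance the defense relies on.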

This is a promising defense against dictionary attacks. As shown in Figure 2.5, the
misclassification of ham emails is significantly reduced by using the defense. At all stages of
the attack, ham emails are never classified as spam and only a moderate number of them are
labeled as unsure. However, while ham messages are often classified properly, the dynamic
threshold causes almost all spam messages to be classified as unsure even when the attack
is only 1% of the inbox. This shows that the dynamic threshold defense fails to adequately
separate ham and spam given the number of spam also classified as unsure.

2.3 Case-Study on Network Anomaly Detection

In this section we study both poisoning strategies and defenses in the context of the
PCA-subspace method for network-wide anomaly detection (Lakhina et al., 2004a), based
on Principal Component Analysis (PCA). This technique has received a large amount of
attention, leading to extensions (Lakhina et al., 2004b, 2005a,b), and inspiring related
research. Additionally, a few companies are exploring the use of PCA-based
techniques and related SVD algorithms in their products (Guavus, 2010; Narus, 2010).


We  consider  an  adversary  who  knows  that  an  ISP  is  using  a  PCA-based  anomaly  detector. 
The  adversary’s  aim  is  to  evade  future  detection  by  poisoning  the  training  data  so  that  the 
detector  learns  a  distorted  set  of  principal  components.  Because  PCA  solely  focuses  on  link 
traffic  covariance,  we  explore  poisoning  schemes  that  add  chaff  (additional  traffic)  into  the 
network  to  increase  the  variance  of  the  network’s  traffic.  The  end  goal  of  the  attacker  is  to 
increase  the  false  negative  rate  of  the  detector,  which  corresponds  to  the  attacker’s  evasion 
success  rate.  That  is,  we  consider  Causative  Integrity  attacks  on  PCA-based  network-wide 
anomaly  detection. 

The  first  contribution  of  this  section  is  a  detailed  analysis  of  how  adversaries  subvert  the 
learning  process  for  the  purposes  of  subsequent  evasion.  We  explore  a  range  of  poisoning 
strategies  in  which  the  attacker’s  knowledge  about  the  network  traffic  state  is  varied,  and 
in  which  the  attacker’s  time  horizon  (length  of  poisoning  episode)  is  varied. 

Because the network data on which SML techniques are applied are non-stationary, the
baseline models must be periodically retrained to capture evolving trends in the underlying
data. In previous usage scenarios (Lakhina et al., 2004a; Soule et al., 2005), the PCA
detector is retrained regularly, for example weekly. A consequence is that attackers could
poison PCA slowly over long periods of time, and thus in a more stealthy fashion.
By perturbing the principal components gradually, the attacker decreases the chance that
the poisoning activity itself is detected. We design such a poisoning scheme, called a Boiling
Frog scheme, and demonstrate that it can boost the false negative rate as high as the
non-stealthy strategies, with far less chaff, albeit over a longer period of time.

Our second main contribution is to design a robust defense against this type of poisoning.
It is known that PCA can be strongly affected by outliers (Ringberg et al., 2007).
Instead of selecting principal components as directions that maximize variance,
robust statistics suggests components that maximize more robust measures of dispersion. It
is well known that the median is a more robust measure of location than the mean, in that
it is far less sensitive to the influence of outliers. This concept can be extended to robust
alternatives to variance such as the Median Absolute Deviation (MAD). Over the past two
decades a number of robust PCA algorithms have been developed that maximize MAD instead
of variance. Recently the PCA-GRID algorithm was proposed as an efficient method
for maximizing MAD without under-estimating variance (a flaw identified in previous
solutions) (Croux et al., 2007). We adapt PCA-GRID for anomaly detection by combining the
method with a new robust cutoff threshold. Instead of modeling the squared prediction error
as Gaussian (as in the original PCA-based detection method), we model the error using
a Laplace distribution. This new threshold is motivated by the observation that the residuals
have longer tails than modeled by a Gaussian. We call our method, which combines PCA-GRID
with a Laplace cutoff threshold, ANTIDOTE. The key intuition behind this method is
to reduce the effect of outliers and help reject poisonous training data.

Our third contribution is to carry out extensive evaluations of both ANTIDOTE and the
original PCA method, in a variety of poisoning situations, and to assess their performance
via multiple metrics. To do this, we used traffic matrix data from the Abilene network
since many other studies of traffic matrix estimation and anomaly detection have used
this data. We show that the original PCA method can be easily compromised by any of
our poisoning schemes, with only small amounts of chaff. For moderate amounts of chaff,
the PCA detector starts to approach the performance of a random detector. However,
ANTIDOTE is dramatically more robust. It outperforms PCA in that i) it more effectively
limits the adversary’s ability to increase his evasion success; ii) it can reject a larger portion
of contaminated training data; and iii) it provides robust protection across nearly all
origin-destination flows through a network. The gains of ANTIDOTE for these performance measures
are large, especially as the amount of poisoning increases. Most importantly, we demonstrate
that ANTIDOTE incurs insignificant shifts in its false negative and false positive performance,
compared to PCA, in the absence of poisoning.

Our study sheds light on the general problem of poisoning SML techniques, in terms
of the types of poisoning schemes that can be constructed, their impact on detection, and
strategies for defense.


2.3.1  Background 

To uncover anomalies, many network detection techniques mine the network-wide traffic
matrix, which describes the traffic volume between all pairs of Points-of-Presence (PoP) in
a backbone network and contains the collected traffic volume time series for each
origin-destination (OD) flow. In this section, we define traffic matrices, present our notation, and
summarize the PCA anomaly detection method of Lakhina et al. (2004a).


2.3.1.1 Traffic Matrices and Volume Anomalies

Network link traffic represents the superposition of OD flows. We consider a network
with $N$ links and $F$ OD flows and measure traffic on this network over $T$ time intervals.
The relationship between link traffic and OD flow traffic is concisely captured in the routing
matrix $A$. This matrix is an $N \times F$ matrix such that $A_{ij} = 1$ if OD flow $j$ passes over link
$i$, and is zero otherwise. If $X$ is the $T \times F$ traffic matrix (TM) containing the time-series
of all OD flows, and if $Y$ is the $T \times N$ link TM containing the time-series of traffic on all
links, then $Y = XA^\top$. We denote the $t$-th row of $Y$ as $y(t) = Y_{t,\bullet}$ (the vector of $N$ link
traffic measurements at time $t$), and the original traffic along a source link $S$ by $y_S(t)$. We
denote column $f$ of routing matrix $A$ by $A_f$.
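
As a small worked example of this notation, the following Python sketch builds link traffic from OD flow traffic. The three-link, two-flow network, the traffic volumes, and the helper names `link_traffic` and `routing_column` are made up for illustration.

```python
def link_traffic(X, A):
    """Compute Y = X A^T, where X is T x F and A is N x F."""
    return [[sum(x * a for x, a in zip(x_t, a_i)) for a_i in A] for x_t in X]


def routing_column(A, f):
    """Column f of A: the indicator vector of links used by OD flow f."""
    return [row[f] for row in A]


# Toy network (an assumption for illustration): N = 3 links, F = 2 OD flows.
A = [[1, 0],   # link 0 carries flow 0 only
     [1, 1],   # link 1 carries both flows
     [0, 1]]   # link 2 carries flow 1 only

# T = 2 time intervals of OD flow volumes.
X = [[10, 5],
     [12, 7]]

Y = link_traffic(X, A)       # rows are the link measurement vectors y(t)
A_1 = routing_column(A, 1)   # links traversed by flow 1
```

Each row of `Y` is the superposition of the flows on each link; for instance, link 1 carries both flows, so its volume is the sum of the two flow volumes in that interval.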

We consider the problem of detecting OD flow volume anomalies across a top-tier network
by observing link traffic volumes. Anomalous flow volumes are unusual traffic load
levels in a network caused by anomalies such as Denial of Service (DoS) attacks, Distributed
DoS attacks, flash crowds, device failures, misconfigurations, and so on. DoS attacks serve
as the canonical example attack in this study.

Traditionally, network-wide anomaly detection was achieved via inverting the noisy routing
matrix, a technique known as tomography (Zhang et al., 2005): since $X \approx Y (A^\top)^{-1}$,
this technique recovers an approximate flow TM from which anomalies can be detected by
simply thresholding. More recently anomography techniques have emerged which detect
anomalous flow volumes directly from the (cheaper to monitor) link measurements.


2.3.1.2 Subspace Method for Anomaly Detection


We briefly summarize the PCA-based anomaly detector introduced by Lakhina et al.
(2004a). The authors observed that high levels of traffic aggregation on ISP backbone links
cause OD flow volume anomalies to often go unnoticed because they are buried within
normal traffic patterns. They also observed that although the measured data has high
dimensionality $N$, normal traffic patterns lie in a subspace of low dimension $K \ll N$. Inferring
this normal traffic subspace using PCA (which finds the principal traffic components) makes
it easier to identify volume anomalies in the remaining abnormal subspace. For the Abilene
Internet2 backbone network (depicted in Figure 2.6), most variance can be captured by the
first $K = 4$ principal components. By comparison the network has $N = 54$ bidirectional
links.

PCA is a dimensionality reduction method that chooses $K$ orthogonal principal components
to form a $K$-dimensional subspace capturing maximal variance in the data. Let $Y$
be the centered link traffic matrix, i.e., with each column of $Y$ translated to have zero
mean. The $k$-th principal component is computed as

$$v_k = \arg\max_{w : \|w\| = 1} \left\| Y \left( I - \sum_{i=1}^{k-1} v_i v_i^\top \right) w \right\| ,$$

where the matrix in between the centered link TM and the candidate direction is a projection
matrix that projects data onto the space orthogonal to the previously computed principal
components. Thus the $k$-th principal component is chosen to be the direction that captures
the maximum amount of variance in the data that is unexplained by principal components
$1, \ldots, k-1$. Equivalently, the $k$-th principal component is the $k$-th eigenvector of the empirical
covariance of the centered traffic matrix.

The resulting $K$-dimensional subspace spanned by the first $K$ principal components
$V_{1:K} = [v_1, v_2, \ldots, v_K]$ is the normal traffic subspace $S_n$ and has a projection matrix
$P_n = V_{1:K} V_{1:K}^\top$. The residual $(N - K)$-dimensional subspace is spanned by the remaining
principal components $V_{K+1:N} = [v_{K+1}, v_{K+2}, \ldots, v_N]$. This space is the abnormal traffic
subspace $S_a$ with a corresponding projection matrix $P_a = V_{K+1:N} V_{K+1:N}^\top = I - P_n$.
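
The construction of the two subspaces can be sketched with NumPy as follows. This is an illustrative sketch, not the detector's original code: the helper name `pca_subspaces`, the synthetic low-rank traffic, and the use of `numpy.linalg.eigh` on the empirical covariance are assumptions made for the example. It also splits one centered measurement into its components in the two subspaces.

```python
import numpy as np


def pca_subspaces(Y, K):
    """Return (mu, V, eigvals, P_n, P_a) for a T x N link traffic matrix Y."""
    mu = Y.mean(axis=0)
    Yc = Y - mu                                   # center each column
    cov = Yc.T @ Yc / (Y.shape[0] - 1)            # empirical covariance (N x N)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # re-sort by decreasing variance
    eigvals, V = eigvals[order], eigvecs[:, order]
    P_n = V[:, :K] @ V[:, :K].T                   # projection onto the normal subspace
    P_a = np.eye(Y.shape[1]) - P_n                # projection onto the abnormal subspace
    return mu, V, eigvals, P_n, P_a


# Synthetic stand-in for link traffic: K = 2 latent patterns over N = 6 links plus noise.
rng = np.random.default_rng(0)
T, N, K = 200, 6, 2
Y = rng.normal(size=(T, K)) @ rng.normal(size=(K, N)) * 10 + rng.normal(size=(T, N))

mu, V, eigvals, P_n, P_a = pca_subspaces(Y, K)
y = Y[0] - mu
y_n, y_a = P_n @ y, P_a @ y                       # normal and residual components of y
```

Computing $P_a$ as $I - P_n$ avoids forming the $N - K$ residual eigenvectors explicitly, which is convenient when only the residual norm is needed.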

Volume anomalies can be detected by decomposing the link traffic into $y(t) = y_n(t) + y_a(t)$,
where $y_n(t)$ is the modeled normal traffic and $y_a(t)$ is the residual traffic, corresponding
to projecting $y(t)$ onto $S_n$ and $S_a$, respectively. Lakhina et al. (2004a) observed that a
volume anomaly at time $t$ typically results in a large change to $y_a(t)$, which can be detected
by thresholding the squared prediction error $\|y_a(t)\|^2$ against $Q_\beta$, the Q-statistic at the $1-\beta$
confidence level described below (Jackson and Mudholkar, 1979).

Figure 2.6: The Abilene network topology. PoPs AM5 and A are located together in Atlanta;
the former is displayed south-east to highlight its connectivity.

Figure 2.7: Links used for data poisoning.

That is, the PCA-based detector classifies a link measurement vector as

$$c(y(t)) = \begin{cases} \text{anomalous}, & \|y_a(t)\|^2 > Q_\beta \\ \text{innocuous}, & \|y_a(t)\|^2 \le Q_\beta \end{cases} \qquad (2.3)$$


While others have explored more efficient distributed variations of this approach (Huang
et al., 2007; Li et al., 2006a,b), we focus on the basic method introduced by Lakhina et al.
(2004a).

The Q-Statistic. The statistical test for the residual function, known as the Q-statistic,
is due to Jackson and Mudholkar (1979). The statistic is computed as a function
$Q_\beta = Q_\beta(\lambda_{K+1}, \ldots, \lambda_N)$ of the $(N - K)$ non-principal eigenvalues of the covariance matrix:

$$Q_\beta = \phi_1 \left[ \frac{c_\beta \sqrt{2 \phi_2 h_0^2}}{\phi_1} + 1 + \frac{\phi_2 h_0 (h_0 - 1)}{\phi_1^2} \right]^{1/h_0} , \qquad \phi_i = \sum_{j=K+1}^{N} \lambda_j^i , \quad i \in \{1,2,3\} ,$$

where

$$h_0 = 1 - \frac{2 \phi_1 \phi_3}{3 \phi_2^2}$$

and $c_\beta$ is the $1 - \beta$ percentile in the standard normal distribution. With the threshold $Q_\beta$,
this statistical test can guarantee that the false alarm probability is no more than $\beta$ (under
assumptions on the near-normality of the traffic data).
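
A minimal Python sketch of the threshold and the resulting classification rule follows. It uses the standard Jackson and Mudholkar (1979) form of the statistic, with $\phi_i$ the sum of the $i$-th powers of the non-principal eigenvalues and $h_0 = 1 - 2\phi_1\phi_3 / (3\phi_2^2)$. The function names and the toy residual spectrum are assumptions for illustration.

```python
from statistics import NormalDist


def q_statistic(residual_eigenvalues, beta):
    """Jackson-Mudholkar threshold Q_beta from the N - K non-principal eigenvalues."""
    phi1, phi2, phi3 = (sum(lam ** i for lam in residual_eigenvalues) for i in (1, 2, 3))
    h0 = 1.0 - 2.0 * phi1 * phi3 / (3.0 * phi2 ** 2)
    c_beta = NormalDist().inv_cdf(1.0 - beta)     # 1 - beta percentile of N(0, 1)
    inner = (c_beta * (2.0 * phi2 * h0 ** 2) ** 0.5 / phi1
             + 1.0
             + phi2 * h0 * (h0 - 1.0) / phi1 ** 2)
    return phi1 * inner ** (1.0 / h0)


def classify(residual, q_beta):
    """Compare the squared prediction error of a residual vector with Q_beta."""
    spe = sum(r * r for r in residual)
    return "anomalous" if spe > q_beta else "innocuous"


# Hypothetical residual spectrum: ten non-principal eigenvalues, all equal to 1.
eigs = [1.0] * 10
q_05 = q_statistic(eigs, beta=0.05)
q_001 = q_statistic(eigs, beta=0.001)      # smaller beta gives a larger threshold

label_small = classify([0.5] * 10, q_05)   # squared prediction error 2.5
label_large = classify([3.0] * 10, q_05)   # squared prediction error 90
```

With equal unit eigenvalues the formula reduces to the Wilson-Hilferty approximation of a chi-square quantile, which gives a convenient sanity check on an implementation.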

2.3.2  Poisoning  Strategies 

Consider  an  adversary  who  knows  an  ISP  uses  a  PCA-based  anomaly  detector.  We  take 
the  point  of  view  of  the  attacker  whose  goal  is  to  evade  detection.  The  adversary  poisons 
the  training  data  so  the  detector  learns  a  distorted  set  of  principal  components.  When  the 
attacker  later  launches  an  attack,  the  PCA-based  detector  will  fail  to  detect  it,  as  a  result 
of  the  poisoning.  In  this  section,  we  propose  a  number  of  data  poisoning  schemes,  each 
designed  to  increase  the  variance  of  the  traffic  used  in  training. 

2.3.2.1 The Contamination Assumption

The adversary’s goal is to launch a Denial of Service (DoS) attack on some victim and to
have the attack traffic successfully cross an ISP’s network without being detected. Figure 2.7
illustrates a simple PoP-to-PoP topology. The DoS traffic needs to traverse from a source
ingress point-of-presence (PoP) node D to a sink egress PoP B of the ISP. Before launching
a DoS attack, the attacker poisons the detector for a period of time, by injecting additional
traffic, chaff, along the OD flow (i.e., the D-to-B flow) that he eventually intends to attack.
This kind of poisoning activity is possible if the adversary gains control over clients of an
ingress PoP or if the adversary compromises a router (or set of routers) within the ingress
PoP. For a poisoning strategy, the attacker needs to decide how much chaff to add, and
when to do so. These choices are guided by the amount of information available to the
attacker.

Attacks Exploiting Increasing Adversarial Information. We consider poisoning
strategies in which the attacker has increasing amounts of information at his disposal. The
weakest attacker is one that knows nothing about the traffic at the ingress PoP, and adds
chaff randomly (called an uninformed attack). An intermediate case is when the attacker
is partially informed. Here the attacker knows the current volume of traffic on the ingress
link(s) that he intends to inject chaff on. Because many networks export SNMP records,
an adversary might intercept this information, or possibly monitor it himself (i.e., in the
case of a compromised router). We call this type of poisoning a locally-informed attack.
Although exported data from routers may be delayed in reaching the adversary, we consider
the case of minimal delay for simplicity.

In a third scenario, the attacker is globally-informed because his global view over the
network enables him to know the traffic levels on all network links. Moreover, we assume
this attacker has knowledge of future traffic link levels. (Recall that in the locally-informed
scheme, the attacker only knows the current traffic volume of a link.) Although these
attacker capabilities are very unlikely, we include this in our study in order to understand
the limits of variance injection poisoning schemes.

Attacks With Distant Time Horizons. Poisoning strategies can also vary according
to the time horizon over which they are carried out. Most studies on the PCA-subspace
method use a one week training period, so we assume that PCA is retrained each week.
Thus the principal components (PCs) used in any week $m$ are those learned during week
$m - 1$ with any detected anomalies removed. For our poisoning attacks, the adversary therefore
inserts chaff along the target OD flow throughout the one week training period. We also
consider a long-term attack in which the adversary slowly, but increasingly, poisons the PCs
over several weeks, by adding small amounts of chaff, in gradually increasing quantities. We
call this the Boiling Frog poisoning method after the folk tale that one can boil a frog by
slowly increasing the water temperature over time.*

Adversarial Control. We assume the adversary does not have control over existing traffic
(i.e., he cannot delay or discard traffic). Similarly, the adversary cannot submit false
SNMP reports to PCA. Such approaches to poisoning are more conspicuous because the
inconsistencies in SNMP reporting from neighboring PoPs could expose the compromised
router.

Remark 8. This study focuses on non-distributed poisoning of DoS detectors. Distributed
poisoning that aims to evade a DoS detector is also possible; our globally-informed poisoning
strategy is an example, as the adversary has control over all network links. We focus on DoS
for three reasons. First, as this is the first study on this topic, we aim to solve the basic problem
before tackling a distributed version. Second, we point out that results on evasion via
non-distributed poisoning are stronger than distributed poisoning results: the DDoS attacker can
monitor and influence many more links than the DoS attacker. Hence a DoS poisoning
scenario is stealthier than a DDoS one. Finally, while the main focus of current PCA-based
systems is the detection of DoS attacks (Lakhina et al., 2004a,b,c, 2005a), the application
of PCA to detecting DDoS attacks has so far been limited (cf. Lakhina et al., 2005b).

For each of these scenarios of different information available to the adversary, we now
outline specific poisoning schemes. In each scheme, the adversary decides on the quantity
$c_t$ of chaff to add to the target flow time series at a time $t$. Each strategy has an attack
parameter $\theta$, which controls the intensity of the attack. For each scenario, we present only
one specific poisoning scheme.

2.3.2.2 Uninformed Chaff Selection

At each time $t$, the adversary decides whether or not to inject chaff according to a
Bernoulli random variable with parameter 0.5. If he decides to inject chaff, the amount of
chaff added is of size $\theta$, i.e., $c_t = \theta$. This method is independent of the network traffic since
our attacker is uninformed, and so the variance of the poisoned traffic is increased by the
variance of the chaff. The choice of a fair coin with $p = 0.5$ maximizes the variance of the
chaff as $\theta^2/4$, which in general would be $p(1 - p)\theta^2$. We call this the Random scheme.
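
The Random scheme is simple enough to sketch directly. The function names, the fixed seed, and the number of time steps below are assumptions chosen for illustration.

```python
import random
from statistics import pvariance


def random_chaff(num_steps, theta, p=0.5, seed=0):
    """Chaff series c_t: theta with probability p, otherwise 0 (independent of traffic)."""
    rng = random.Random(seed)
    return [theta if rng.random() < p else 0.0 for _ in range(num_steps)]


def chaff_variance(theta, p=0.5):
    """Variance of a single chaff value, p (1 - p) theta^2."""
    return p * (1.0 - p) * theta ** 2


theta = 2.0
chaff = random_chaff(num_steps=10_000, theta=theta)
empirical = pvariance(chaff)          # should be close to theta^2 / 4
```

Since $p(1 - p)$ peaks at $p = 0.5$, any biased coin would add less variance for the same chaff size $\theta$.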

"*^Note  that  there  is  nothing  inherent  in  the  choice  of  a  one-week  poisoning  period.  For  a  general  SML 
algorithm, our strategies would correspond to poisoning over one training period (whatever its length) or
multiple training periods.

2.3.2.3 Locally-Informed Chaff Selection

The attacker’s goal is to increase traffic variance, on which the PCA detector’s model
is based. In the locally-informed scenario, the attacker knows the volume of traffic in the
ingress link he controls, $y_S(t)$. Hence this scheme elects to only add chaff when the existing
traffic is already reasonably large. In particular, we add chaff when the traffic volume on the
link exceeds a parameter $\alpha$ (we typically use the mean, assuming that the data is reasonably
stationary in the sense that the mean does not change quickly over time). The amount of
chaff added is $c_t = (\max\{0, y_S(t) - \alpha\})^\theta$. In other words, we take the difference between
the link traffic and a parameter $\alpha$ and raise it to $\theta$. In this scheme (called
Add-More-If-Bigger), the further the traffic is from the average load, the larger the deviation of chaff
inserted.
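
A short Python sketch of the Add-More-If-Bigger rule is given below. The function name and the toy link volumes are assumptions; `alpha` defaults to the mean of the supplied series, which a real attacker would have to estimate from past observations.

```python
def add_more_if_bigger(link_volumes, theta, alpha=None):
    """Chaff c_t = (max{0, y_S(t) - alpha})^theta for each time step."""
    if alpha is None:
        alpha = sum(link_volumes) / len(link_volumes)   # mean, as typically used
    return [max(0.0, y - alpha) ** theta for y in link_volumes]


# Hypothetical ingress-link volumes y_S(t); their mean is 25.
y_S = [10.0, 20.0, 30.0, 40.0]
chaff = add_more_if_bigger(y_S, theta=2.0)
```

No chaff is added while the link is at or below the reference level, so the injected traffic reinforces existing peaks rather than raising the baseline.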

2.3.2.4 Globally-Informed Chaff Selection

The globally-informed scheme captures an omnipotent adversary with full knowledge of
$Y$, $A$, and the future measurements $\tilde{y}_t$, and who is capable of injecting chaff into any network
flow during training. In the poisoning schemes above, the adversary can only inject chaff
along a single compromised link, whereas in this scenario, the adversary can inject chaff
along any link. We formalize the problem of selecting a link $n$ to poison, and selecting an
amount of chaff $C_{tn}$, as an optimization problem that the adversary solves to maximize the
chances of evasion. Although these globally-informed capabilities are unrealistic in practice,
we analyze globally-informed poisoning in order to understand the limits of variance injection
methods and to gain insight into the poisoning strategies that exploit limited capabilities.

Ideal Objective: The PCA Evasion Problem. In the PCA Evasion Problem an
adversary wishes to launch an undetected DoS attack of volume $\delta$ along flow $f$ at future
time $t$. If the vector of link volumes at time $t$ is $\tilde{y}_t$, where the tilde distinguishes this future
measurement from past training data $Y$, then the vectors of anomalous DoS volumes are
given by $\tilde{y}'_t = \tilde{y}_t + \delta \cdot A_f$. Denote by $C$ the matrix of link traffic injected into the network by
the adversary during training. Then the PCA-based anomaly detector is trained on altered
link traffic matrix $Y + C$ to produce the mean traffic vector $\mu$, the top $K$ eigenvectors $V_{1:K}$,
and the squared prediction error threshold $Q_\beta$. The adversary’s objective is to enable as
large a DoS attack as possible (maximizing $\delta$) by choosing an appropriate $C$. The PCA
Evasion Problem corresponds to solving the following program:

$$\begin{aligned}
\max_{\delta \in \mathbb{R},\; C \in \mathbb{R}^{T \times F}} \quad & \delta \\
\text{s.t.} \quad & (\mu, V, Q_\beta) = \mathrm{PCA}(Y + C) \\
& \left\| V_{K+1:N}^\top (\tilde{y}'_t - \mu) \right\|_2^2 \le Q_\beta \\
& \|C\|_1 \le \theta \\
& C_{tn} \ge 0 \quad \forall t, n ,
\end{aligned}$$

where $\theta$ is an attacker-tunable parameter constraining total chaff. The first constraint
represents the output of PCA (i.e., does not constrain the program’s solution). The second
constraint guarantees evasion by requiring that the contaminated link volumes at time $t$ be
classified as innocuous (cf. Equation (2.3)). The remaining constraints upper-bound the total
chaff volume by $\theta$ and constrain the chaff to be non-negative, corresponding to the level of
adversarial control and the contamination assumption that no negative chaff may be added
to the network.


Relaxations.  Unfortunately,  the  above  optimization  seems  difficult  to  solve  analytically. 
Thus  we  relax  the  problem  to  obtain  a  tractable  analytic  solution. 

First, the above objective seeks to maximize the attack direction $A_f$’s projected length
in the normal subspace, $\max_{C \in \mathbb{R}^{T \times F}} \|V_{1:K}^\top A_f\|_2$. Next, we restrict our focus to traffic
processes that generate spherical $K$-rank link traffic covariance matrices.† This property
implies that the eigen-spectrum consists of $K$ ones followed by all zeroes. Such an
eigen-spectrum allows us to approximate the top eigenvectors $V_{1:K}$ in the objective with the
matrix of all eigenvectors weighted by their corresponding eigenvalues, $\Lambda V$. We can thus
convert the PCA evasion problem into the following optimization:

$$\begin{aligned}
\max_{C \in \mathbb{R}^{T \times F}} \quad & \|(Y + C) A_f\|_2 \qquad (2.4) \\
\text{s.t.} \quad & \|C\|_1 \le \theta \\
& C_{tn} \ge 0 \quad \forall t, n .
\end{aligned}$$

Solutions to this optimization are obtained by a standard Projection Pursuit method from
optimization: iteratively take a step in the direction of the objective’s gradient and then
project onto the feasible set. Finally, the iteration can be initialized by relaxing the $L_1$
constraint on the chaff matrix to the analogous $L_2$ constraint and dropping the remaining
constraints. This produces a differentiable program which can be solved using standard
Lagrangian techniques to initialize the iteration.
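
The gradient-then-project iteration can be sketched with NumPy as below. This is a hedged illustration rather than the procedure used in the experiments: the chaff matrix is taken to have the same shape as the link traffic matrix so that their sum is defined, the traffic matrix is centered before use, the projection onto the feasible set uses a standard sort-based simplex projection, and the step size, iteration count, initializer scaling, and function names are all assumptions.

```python
import numpy as np


def project_feasible(V, theta):
    """Euclidean projection of V onto {C : C >= 0, sum(C) <= theta}."""
    clipped = np.maximum(V, 0.0)
    if clipped.sum() <= theta:
        return clipped
    # Otherwise project onto the simplex {C >= 0, sum(C) = theta}.
    u = np.sort(V.ravel())[::-1]
    css = np.cumsum(u) - theta
    ks = np.arange(1, u.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(V - tau, 0.0)


def relaxed_poison(Yc, a_f, theta, steps=50, lr=1.0):
    """Projected gradient ascent on ||(Yc + C) a_f||_2 for centered traffic Yc."""
    D = np.outer(Yc @ a_f, a_f)                        # initializer direction Y A_f A_f^T
    C = project_feasible(theta * D / np.linalg.norm(D), theta)
    for _ in range(steps):
        r = (Yc + C) @ a_f
        grad = np.outer(r, a_f) / np.linalg.norm(r)    # gradient of the objective in C
        C = project_feasible(C + lr * grad, theta)
    return C


# Synthetic example: T = 20 intervals, N = 4 links, target flow uses links 0 and 1.
rng = np.random.default_rng(0)
Y = rng.normal(100.0, 10.0, size=(20, 4))
Yc = Y - Y.mean(axis=0)
a_f = np.array([1.0, 1.0, 0.0, 0.0])
theta = 50.0

C = relaxed_poison(Yc, a_f, theta)
```

Both the initializer and every gradient are outer products with `a_f`, so columns of `C` for links outside the target flow stay at zero throughout the iteration.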


Relation to Uninformed and Locally Informed Schemes. The relaxed solution to
the PCA evasion problem yields an interesting insight relating to the uninformed and
locally-informed chaff selection methods. Recall that the adversary is capable of injecting chaff
along any flow. One could imagine that it might be useful to inject chaff along an OD flow
whose traffic dominates the choice of principal components (i.e., an elephant flow), and
then send the DoS traffic along a different flow (that possibly shares a subset of links with
the poisoned OD flow). However, Equation (2.4) indicates that the best (relaxed) strategy
to evade detection is to inject chaff only along the links $A_f$ associated with the target flow
$f$. This follows from the form of the initializer $C^{(0)} \propto Y A_f A_f^\top$ (obtained from the $L_2$
relaxation for initialization) as well as the form of the projection and gradient steps. In
particular, all iterates preserve the property that the solution only injects chaff along the
target flow.


† While the spherical assumption does not hold in practice, the assumption of low-rank traffic matrices
is met by published datasets (Lakhina et al., 2004a).

In fact, the only difference between our globally-informed solution and the
locally-informed scheme is that the former uses information about the entire traffic matrix $Y$
to determine chaff allocation along the flow whereas the latter uses only local information.

This result adds credence to the intuition that in a chaff-constrained poisoning attack,
all available chaff should be inserted along the target flow.

2.3.2.5 Boiling Frog Attacks

In  the  above  attacks,  poisoning  occurs  during  a  single  week.  We  next  consider  a  long-term 
attack  in  which  the  adversary  slowly,  but  increasingly,  poisons  the  principal  components  over 
several  weeks,  starting  with  the  second  week  by  adding  small  amounts  of  chaff,  in  gradually 
increasing  quantities.  This  kind  of  poisoning  approach  is  useful  for  adversaries  that  plan 
DoS  attacks  in  advance  of  special  events  (like  the  Olympics,  the  World  Cup  soccer  finals, 
the  scheduled  release  of  a  new  competing  product,  etc.) 

Boiling Frog poisoning can use any of the preceding chaff schemes to select $c_t$. The
amount of poisoning is increased over the duration of the Causative attack as follows. We
initially set the attack parameter $\theta_1$ to be zero, so that in the first week, no chaff is added
to the training data and PCA is trained on a week of ‘clean’ data to establish a baseline
model (representing the state of the detector prior to the start of poisoning). Over the
course of the second week, the target flow is injected with chaff generated using parameter
$\theta_2$. At the week’s end, PCA is retrained on that week’s data, with any anomalies detected
by PCA during that week excluded from the week’s training set. This process continues
with parameter $\theta_t > \theta_{t-1}$ used for week $t$.
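
The weekly retraining loop can be sketched in Python as follows. The loop itself follows the description above, but everything plugged into it is a toy stand-in chosen for brevity: traffic is scalar, the "detector" is a mean plus three standard deviations cutoff rather than PCA, chaff is added to every other sample, and all names are hypothetical.

```python
from statistics import mean, pstdev


def boiling_frog(weeks, thetas, inject, train, is_anomalous):
    """Retrain weekly on poisoned data that the previous week's model did not flag."""
    model = train(weeks[0])                      # week 1: theta_1 = 0, clean baseline
    for clean_week, theta in zip(weeks[1:], thetas[1:]):
        poisoned = inject(clean_week, theta)
        kept = [y for y in poisoned if not is_anomalous(model, y)]
        model = train(kept)                      # retrained from scratch each week
    return model


# Toy stand-ins (assumptions): scalar traffic and a mean + 3 sigma cutoff instead of PCA.
def train(samples):
    return mean(samples) + 3.0 * pstdev(samples)     # the model is just a cutoff


def is_anomalous(cutoff, y):
    return y > cutoff


def inject(week, theta):
    return [y + theta if i % 2 == 0 else y for i, y in enumerate(week)]


week = [100.0 + (i % 5) for i in range(50)]
weeks = [week] * 5
thetas = [0.0, 1.0, 2.0, 3.0, 4.0]               # theta_t > theta_{t-1}

baseline_cutoff = train(week)
poisoned_cutoff = boiling_frog(weeks, thetas, inject, train, is_anomalous)
```

In this toy run each week's chaff stays below the previous week's cutoff, so none of it is rejected and the cutoff drifts upward week by week, mirroring the gradual poisoning described in the text.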

Although  PCA  is  retrained  from  scratch  each  week,  the  training  data  includes  only  those 
events  not  flagged  as  anomalous  by  the  previous  detector.  Thus,  each  successive  week  will 
contain  additional  malicious  training  data,  with  the  process  continuing  until  the  week  of  the 
DoS  attack,  when  the  adversary  stops  injecting  chaff. 

The effect of this scheme is to slowly rotate the normal subspace, injecting low levels of
chaff relative to the previous week’s traffic levels so that PCA’s rejection rates stay low and
a large portion of the present week’s poisoned traffic matrix is trained on for the following
week’s model.


2.3.3  ANTIDOTE:  A  Robust  Defense 


To defend against the above poisoning attacks on PCA-based anomaly detection, we
explore techniques from Robust Statistics. Such methods are less sensitive to outliers, and
as such are ideal defenses against variance injection schemes that perturb data to increase
variance along the target flow. There have been two broad approaches to making
PCA robust: the first computes the principal components as the eigenspectrum of a robust
estimate of the covariance matrix (Devlin et al., 1981), while the second approach searches
for directions that maximize a robust scale estimate of the data projection (Croux et al.,
2007). We explore one of the latter methods as a counter-measure to poisoning. After
describing the method, we propose a new threshold statistic that can be used for any
PCA-based method including robust PCA. Robust PCA and the new robust Laplace threshold
together form a new network-wide traffic anomaly detection method, named ANTIDOTE,
that is less sensitive to our poisoning attacks.


2.3.3.1  Intuition


To mitigate the effect of variance injection poisoning attacks, we need a learning
algorithm that is stable in spite of data contamination; i.e., a small amount of data
contamination should not dramatically change the model produced by the algorithm. This concept
of stability has been studied in the field of Robust Statistics, where robustness is used to
describe the notion of stability. In particular, there have been several approaches to
developing robust PCA algorithms that construct a low dimensional subspace that captures most
of the data's dispersion and are stable under data contamination (Croux and Ruiz-Gazen,
2005; Croux et al., 2007; Devlin et al., 1981; Li and Chen, 1985; Maronna, 2005).

The robust PCA algorithms we considered search for a unit direction $v$ on which the
data projections maximize some measure of univariate dispersion $S(\cdot)$; that is,

    $$v \in \arg\max_{\|a\|_2 = 1} S(Ya) . \qquad (2.5)$$

The standard deviation is the dispersion measure used by PCA; i.e.,
$S^{SD}(r_1, r_2, \ldots, r_n) = \left( \frac{1}{n-1} \sum_{i=1}^{n} (r_i - \bar{r})^2 \right)^{1/2}$.
However, standard deviation is sensitive to outliers, making PCA non-robust to
contamination. Robust PCA algorithms instead use measures of dispersion based
on the concept of robust projection pursuit (RPP) estimators (Li and Chen, 1985). As is
shown by Li and Chen (1985), RPP estimators achieve the same breakdown point as their
dispersion measure as well as being qualitatively robust; i.e., the estimators are stable.

However, unlike the eigenvector solutions that arise in PCA, there is generally no
efficiently computable solution for the maximizers of robust dispersion measures and so the
solutions must be approximated. Below, we describe the PCA-grid algorithm, a successful
method for approximating robust PCA subspaces developed by Croux et al. (2007). Among
the projection pursuit techniques we considered (Croux and Ruiz-Gazen, 2005; Maronna,
2005), PCA-grid proved to be most resilient to our poisoning attacks. It is worth
emphasizing that the procedure described in the next section is simply a technique for
approximating a projection pursuit estimator and does not itself contribute to the algorithm's
robustness; that robustness comes from the definition of the projection pursuit estimator
in Equation (2.5).

To better understand the efficacy of a robust PCA algorithm, we demonstrate by example
the effect our poisoning techniques have on the PCA algorithm and contrast it with the
effect on the PCA-grid algorithm. In Figure 2.8, we see the impact of a globally informed
poisoning attack on both algorithms. Initially, the 'clean' data was clustered in an ellipse.
In the first plot, we see that both algorithms construct reasonable estimates for the center
and first principal component for this data.


Footnote: 'Dispersion' is an alternative term for variation, since the latter is often
associated with statistical variation. By a dispersion measure we mean a statistic that
measures the variability or spread of a variable.

Footnote: The breakdown point of an estimator is the (asymptotic) fraction of the data an
adversary must control in order to arbitrarily change an estimator, and as such is a common
measure of statistical robustness.
Figure 2.8: Here the data has been projected into the 2-dimensional space spanned by the
1st  principal  component  and  the  direction  of  the  attack  flow  #118  in  the  Abilene  dataset. 
The  effect  on  the  1st  principal  components  of  PCA  and  PCA-grid  is  shown  under  a  globally 
informed  attack. 


However,  in  the  second  plot,  we  see  that  a  large  amount  of  poisoning  dramatically 
perturbs  some  of  the  data  and  as  a  result  the  PCA  subspace  is  dramatically  shifted  toward 
the  target  flow’s  direction  (the  y-axis  in  this  example).  Due  to  this  shift,  DoS  attacks  along 
the  target  flow  will  be  less  detectable.  Meanwhile,  the  subspace  of  PCA-grid  is  noticeably 
less  affected. 


2.3.3.2  PCA-GRID 


The PCA-grid algorithm introduced by Croux et al. (2007) is a projection pursuit
technique as described above. It finds a $K$-dimensional subspace that approximately maximizes
$S(\cdot)$, a robust measure of dispersion, for the data $Y$ as in Equation (2.5). The first step in
describing the algorithm is to specify the robust measure of dispersion. We use the Median
Absolute Deviation (MAD) over other possible choices for $S(\cdot)$. For scalars $r_1, \ldots, r_n$ the
MAD is defined as

    $$S^{MAD}(r_1, \ldots, r_n) = \omega \cdot \mathrm{median}_{i \in [n]} \left\{ \left| r_i - \mathrm{median}_{j \in [n]} \{ r_j \} \right| \right\} ,$$

where the coefficient $\omega = 1.4826$ ensures asymptotic consistency on normal distributions.
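
As a small illustration of why the MAD is preferred here, the following sketch computes
it alongside the sample standard deviation on a toy sample before and after two values
are replaced by gross outliers. The data values are made up for the example.

```python
import statistics

OMEGA = 1.4826  # consistency factor for normally distributed data


def mad(values):
    """Median Absolute Deviation, scaled by OMEGA."""
    med = statistics.median(values)
    return OMEGA * statistics.median(abs(r - med) for r in values)


clean = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
poisoned = clean[:8] + [1000, 1000]      # two of ten points contaminated

mad_clean, mad_poisoned = mad(clean), mad(poisoned)
sd_clean, sd_poisoned = statistics.stdev(clean), statistics.stdev(poisoned)
```

On this sample the standard deviation grows by more than two orders of magnitude under
contamination, while the MAD is essentially unchanged.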

The next step is to choose an estimate of the data's central location. In PCA, this
estimate is simply the mean of the data. However, the mean is not robust, so we center the
data using the spatial median instead:

    $$c(Y) \in \arg\min_{\mu} \sum_{i=1}^{n} \| y_i - \mu \|_2 ,$$

which involves a convex optimization that is efficiently solved (see e.g., Hössjer and Croux,
1995).
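
One simple way to approximate the spatial median is Weiszfeld's fixed-point iteration,
sketched below. This is a swapped-in textbook technique chosen for brevity, not
necessarily the solver used in our experiments or in the cited reference; the function
name and its parameters are assumptions of this sketch.

```python
import math


def spatial_median(points, iters=200, tol=1e-9):
    """Approximate argmin_mu sum_i ||y_i - mu||_2 by Weiszfeld iterations."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    c = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(iters):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            dist = math.dist(p, c)
            if dist < 1e-12:          # skip a point that coincides with the iterate
                continue
            w = 1.0 / dist            # closer points get larger weights
            den += w
            for d in range(dim):
                num[d] += w * p[d]
        if den == 0.0:
            break
        new_c = [v / den for v in num]
        converged = math.dist(new_c, c) < tol
        c = new_c
        if converged:
            break
    return c


# Four corners of the unit square plus one far outlier.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
center = spatial_median(pts)
mean = [sum(p[d] for p in pts) / len(pts) for d in range(2)]
```

In this example the mean is dragged to roughly (20.4, 20.4) by the single outlier,
whereas the spatial median stays inside the unit square.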


The original projection pursuit technique for robust PCA was first proposed by Li and
Chen (1985), however their method was complicated and inefficient. The desirable statistical
properties of consistency, asymptotic normality (Cui et al., 2003; Li and Chen, 1985) and
robustness in terms of influence functions and breakdown points (Croux and Ruiz-Gazen,
2005) have all been shown for these methods. Croux and Ruiz-Gazen (1996) introduced an
efficient implementation to approximate the above objective, however Croux et al. (2007)
observed that for all datasets the method of Croux and Ruiz-Gazen (1996) implodes in the
presence of many variables: the lower half of the 'eigenvalues' (the estimates of scale)
are identically zero. Croux et al. (2007) propose the PCA-grid algorithm as an efficient
implementation that does not suffer from this downward bias of the scale estimates.

Given a dispersion measure and location estimate, PCA-grid finds a (unit) direction $v$
that is an approximate solution to Equation (2.5). The PCA-grid algorithm uses a grid
search for this task. Namely, suppose we want to find the best candidate between some pair
of unit vectors $a_1$ and $a_2$ (a 2-dimensional search space). The search space is the unit circle
parameterized by $\phi$ as $a_\phi = \cos(\phi) a_1 + \sin(\phi) a_2$ with $\phi \in [-\pi/2, \pi/2]$. The grid search
splits the domain of $\phi$ into a mesh of $Q + 1$ candidates
$\phi_k = \frac{\pi}{2} \left( \frac{2k}{Q} - 1 \right)$, $k = 0, \ldots, Q$.
Each candidate vector $a_{\phi_k}$ is assessed and the one that maximizes $S(Y a_{\phi_k})$ is chosen as the
approximate maximizer $\hat{a}$.

To search in a more general $N$-dimensional space, the grid search iteratively refines
its current best candidate $\hat{a}$ by performing a 2-dimensional grid search between $\hat{a}$ and
each of the unit directions $e_j$ in turn. With each iteration, the range of angles considered
progressively narrows around $\hat{a}$ to better explore its neighborhood. This procedure (outlined
in Algorithm 1) approximates the direction of maximal dispersion analogous to a principal
component in PCA.

To find the $K$-dimensional subspace $\{ v_i \mid v_i^\top v_j = \delta_{ij} \}$ that maximizes the dispersion
measure, the Grid-Search is repeated $K$ times. After each repetition, the data is deflated
to remove the dispersion captured by the last direction from the data. This process is
detailed in Algorithm 2.
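
The grid search and deflation steps can be sketched in a few lines of plain Python. This
is an illustrative sketch rather than the implementation used in our experiments: the
function names and default parameters (`sweeps`, `mesh`) are assumptions, candidates are
explicitly renormalized to unit length, and the coordinate-wise median stands in for the
spatial median to keep the example short.

```python
import math
import statistics


def mad(values):
    """Median Absolute Deviation with the normal-consistency factor."""
    med = statistics.median(values)
    return 1.4826 * statistics.median(abs(v - med) for v in values)


def project(Y, a):
    """Project each row of Y onto direction a."""
    return [sum(y * w for y, w in zip(row, a)) for row in Y]


def grid_search(Y, sweeps=5, mesh=20, scale=mad):
    """Approximate a unit direction maximizing scale(Y a)."""
    n_dims = len(Y[0])
    best = [1.0] + [0.0] * (n_dims - 1)            # start from e_1
    best_score = scale(project(Y, best))
    for i in range(1, sweeps + 1):
        for j in range(n_dims):
            base = best[:]                          # current candidate for this plane
            for k in range(mesh + 1):
                phi = math.pi / 2 ** i * (2 * k / mesh - 1)   # range narrows with i
                cand = [math.cos(phi) * b for b in base]
                cand[j] += math.sin(phi)
                norm = math.sqrt(sum(c * c for c in cand))
                if norm < 1e-12:
                    continue                        # degenerate combination
                cand = [c / norm for c in cand]     # keep the candidate on the unit sphere
                score = scale(project(Y, cand))
                if score > best_score:
                    best, best_score = cand, score
    return best


def pca_grid(Y, n_components, **search_args):
    """Return (center, directions) found by repeated grid search and deflation."""
    center = [statistics.median(col) for col in zip(*Y)]   # stand-in location estimate
    R = [[y - c for y, c in zip(row, center)] for row in Y]
    directions = []
    for _ in range(n_components):
        v = grid_search(R, **search_args)
        directions.append(v)
        scores = project(R, v)
        # Deflate: remove the component of every row along v.
        R = [[y - s * w for y, w in zip(row, v)] for row, s in zip(R, scores)]
    return center, directions
```

Passing `scale=statistics.pstdev` to `grid_search` turns the same search into an
approximation of ordinary variance-maximizing PCA, which makes it easy to compare the two
dispersion measures on contaminated data.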


Algorithm 1  Grid-Search(Y)

Require: $Y$ is a $T \times N$ matrix
  Let: $\hat{a} = v = e_1$;
  for $i = 1$ to $C$ do
    for $j = 1$ to $N$ do
      for $k = 0$ to $Q$ do
        Let: $\phi_k = \frac{\pi}{2^i} \left( \frac{2k}{Q} - 1 \right)$;
        Let: $a_{\phi_k} = \cos(\phi_k) \hat{a} + \sin(\phi_k) e_j$;
        if $S(Y a_{\phi_k}) > S(Y v)$ then
          Assign: $v \leftarrow a_{\phi_k}$;
      Assign: $\hat{a} \leftarrow v$;
  Return: $v$;


Algorithm 2  PCA-grid(Y, K)

  Center $Y$: $Y \leftarrow Y - c(Y)$;
  for $i = 1$ to $K$ do
    $v_i \leftarrow$ Grid-Search($Y$);
    $Y \leftarrow$ projection of $Y$ onto the complement of $v_i$;
  end for
  Return subspace centered at $c(Y)$ with principal directions $v_1, \ldots, v_K$;


2.3.3.3  Robust Laplace Threshold

In addition to the robust PCA-grid algorithm, we also use a robust estimate for its
residual threshold in place of the Q-statistic described in Section 2.3.1.2. Using the
Q-statistic as a threshold was motivated by an assumption of normally distributed residuals
(Jackson and Mudholkar, 1979). However, we found that the residuals for both the
PCA and PCA-grid subspaces were empirically non-normal, leading us to conclude that
the Q-statistic is a poor choice for our detection threshold. Instead, to account for the
outliers and heavy-tailed behavior we observed from our method's residuals, we choose our
threshold as the $1 - \beta$ quantile of a Laplace distribution. It is important to note that while
we do not believe the residuals to be distributed according to a Laplace, a Laplace model
better models the heavy-tailed nature observed in the data. Our detector, ANTIDOTE, is the
combination of the PCA-grid algorithm and the Laplace threshold. The non-normality of
the residuals has also been recently pointed out by Brauckhoff et al. (2009).

As with the previous Q-statistic method described in Section 2.3.1.2, we select our
threshold $Q_{L,\beta}$ as the $1 - \beta$ quantile of a parametric distribution fit to the residuals in
the training data. However, instead of the normal distribution assumed by the Q-statistic,
we use the quantiles of a Laplace distribution specified by a location parameter $c$ and a
scale parameter $b$. Critically, though, instead of using the mean and standard deviation, we
robustly fit the distribution's parameters. We estimate $c$ and $b$ from the residuals $\|y_a(t)\|^2$
using robust consistent estimates of location (median) and scale (MAD)

    $$\hat{c} = \mathrm{median} \left( \|y_a(t)\|^2 \right) ,$$
    $$\hat{b} = \frac{1}{\sqrt{2} P^{-1}(0.75)} \mathrm{median} \left\{ \left| \|y_a(t)\|^2 - \hat{c} \right| \right\} ,$$

where $P^{-1}(q)$ is the $q$ quantile of the standard Laplace distribution. The Laplace quantile
function has the form $P^{-1}_{c,b}(q) = c + b \cdot k(q)$ for some $k(q)$. Thus, our threshold only depends
linearly on the (robust) estimates $\hat{c}$ and $\hat{b}$, making the threshold itself robust. This form
is also shared by the normal quantiles (differing only in the function $k$), but because
non-robust estimates for $c$ and $b$ are implicitly used by the Q-statistic, it is not robust. Further,
by choosing a heavy-tailed distribution like the Laplace, the quantiles are more appropriate
for the heavy tails we observed.

Figure 2.9: Histograms of the residuals for the original PCA algorithm (left) and the
PCA-grid algorithm (the largest residual is excluded as an outlier). Red and blue vertical
lines demarcate the threshold selected using the Q-statistic and the Laplace threshold,
respectively.

Empirically, the Laplace threshold appears to be better suited for thresholding the
residuals of our robust models than the Q-statistic. As can be seen in Figure 2.9, both the
Q-statistic and the Laplace threshold produce a reasonable threshold on the residuals of
the PCA algorithm but only the Laplace threshold produces a reasonable threshold for the
residuals of the PCA-grid algorithm; the Q-statistic vastly under-estimates the spread of
the residuals. As was consistently seen throughout our experiments, the Laplace threshold
proved to be a more reliable threshold than the Q-statistic for robust PCA.
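
The robust Laplace threshold of this section can be sketched as below. The sketch assumes
that 'standard Laplace' means zero mean and unit variance (scale $1/\sqrt{2}$), in which
case $\sqrt{2} P^{-1}(0.75) = \ln 2$ and the scale estimate is consistent for Laplace data;
it also uses the closed form $k(1 - \beta) = -\ln(2\beta)$, valid for $\beta \le 0.5$. The
function name and the toy residuals are illustrative only.

```python
import math
import statistics


def laplace_threshold(residuals, beta):
    """1 - beta quantile of a Laplace distribution robustly fit to the residuals."""
    c_hat = statistics.median(residuals)                      # robust location
    raw_mad = statistics.median(abs(r - c_hat) for r in residuals)
    # 0.75 quantile of the unit-variance Laplace (assumption stated above).
    p_inv_075 = math.log(2.0) / math.sqrt(2.0)
    b_hat = raw_mad / (math.sqrt(2.0) * p_inv_075)            # equals raw_mad / ln 2
    k = -math.log(2.0 * beta)                                 # k(1 - beta), beta <= 0.5
    return c_hat + b_hat * k


residuals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
with_outlier = residuals[:-1] + [1000]
t_clean = laplace_threshold(residuals, beta=0.001)
t_outlier = laplace_threshold(with_outlier, beta=0.001)
```

Because both estimates are medians, replacing the largest residual by an extreme value
leaves the threshold unchanged in this example.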


2.3.4  Methodology 

We now describe in depth the experimental methodology used to measure the
performance of the proposed poisoning strategies for PCA and the performance of ANTIDOTE as
a counter-measure for variance injection attacks.


2.3.4.1  Traffic Data


We use OD flow data collected from the Abilene (Internet2 backbone) network to
simulate attacks on PCA-based anomaly detection. Data was collected over an almost continuous
6 month period from March 1, 2004 through September 10, 2004 (Zhang et al., 2005). Each
week of data consists of 2016 measurements across all 144 network OD flows binned into 5
minute intervals. At the time of collection the network consisted of 12 PoPs and 15
inter-PoP links. 54 virtual links are present in the data corresponding to two directions for each
inter-PoP link and an ingress and egress link for each PoP. See Figure 2.6 for the Abilene
network topology.
2.3.4.2  Validation


To evaluate the PCA subspace method and ANTIDOTE in the face of poisoning and
DoS attacks, we use two consecutive weeks of data: the first for training and the second
for testing. The poisoning occurs throughout the training phase, while the attack occurs
during the test week. An alternate method (described in Section 2.3.4.3 below) is needed
for the Boiling Frog scheme where training and poisoning occur over multiple weeks. Our
performance metric for measuring the success of the poisoning strategies is their
impact on a PCA-based detector's false negative rate (FNR). The FNR is the ratio of the
number of successful evasions to the total number of attacks (i.e., the attacker's success
rate is PCA's FNR). We also use Receiver Operating Characteristic (ROC) curves to
visualize a detection method's trade-off between detection rate (TPR) and false positive rate
(FPR).

In order to compute the FNRs and FPRs, we generate synthetic anomalies according to
the method of Lakhina et al. (2004a) and inject them into the Abilene data. While there are
disadvantages to this method, such as the conservative assumption that a single volume size
is anomalous for all flows, we adopt it for the purposes of relative comparison between PCA
and Robust PCA, to measure relative effects of poisoning, and for consistency with prior
studies. We use week-long training sets, as such a time scale is sufficiently large to capture
weekday and weekend cyclic trends (Ringberg et al., 2007), and previous studies operated
on this same time scale (Lakhina et al., 2004a). There is nothing inherent to our method
that limits its use to this time scale; our methods will work as long as the training data is
poisoned throughout. Because the data is binned in 5 minute windows (corresponding to
the reporting interval of SNMP), a decision about whether or not an attack is present can
be made at the end of each 5 minute window; thus attacks can be detected within 5 minutes
of their occurrence. We now describe the method of Lakhina et al. (2004a) adopted here.

Starting with the flow traffic matrix $X$ for the test week, we generate a positive example
(an anomalous OD flow) by setting flow $f$'s volume at time $t$, $X_{t,f}$, to be a large value known
to correspond to an anomalous flow (replacing the original traffic volume in this time slot).
This value is defined (Lakhina et al., 2004a) to be 1.5 times a cutoff of $8 \times 10^7$. After
multiplying by the routing matrix $A$, the link volume measurement at time $t$ is anomalous.
We repeat this process for each time $t$ (each 5 minute window) in the test week to generate
a set of 2016 anomaly samples for the single target flow $f$.

Footnote: The cutoff was determined by fitting a basis of sinusoids of periods 7, 5, 3 days,
24, 12, 6, 3 and 1.5 hours to flow traffic and identifying the original flow volume
corresponding to a steep drop in the rank-ordered residuals.

In order to obtain FPRs, we generate negative examples (benign OD flows) as follows. We
fit the data to an exponentially weighted moving average (EWMA) model that is intended
to capture the main trends of the data without much noise. We use this model to select
which points in time, in an Abilene flow's time series, to use as negative examples. We
compare the actual observations and the EWMA model, and if the difference is small (not
in the flow's top one percentile) for a particular flow at a particular time, $X_{t,f}$, then we
label the measurement $X_{t,f}$ as "benign." We do this across all flows; when we find time
slots where all flows are labeled as benign, we run our detectors and see whether or not they
raise an alarm for those time slots.
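
The construction of positive and negative examples can be sketched as follows. The
smoothing factor `alpha`, the one-step-ahead form of the EWMA comparison, and the handling
of ties at the percentile cutoff are assumptions of this sketch; the text above does not
fix them. The traffic matrix is represented as a list of rows, one per time slot.

```python
def inject_anomaly(X, t, f, cutoff, factor=1.5):
    """Return a copy of X with flow f at time t replaced by factor * cutoff."""
    Xa = [row[:] for row in X]
    Xa[t][f] = factor * cutoff
    return Xa


def ewma_deviations(series, alpha=0.2):
    """Absolute difference between each observation and the EWMA of earlier ones."""
    smooth = series[0]
    devs = []
    for x in series:
        devs.append(abs(x - smooth))          # compare before updating the model
        smooth = alpha * x + (1 - alpha) * smooth
    return devs


def benign_mask(series, alpha=0.2, top_fraction=0.01):
    """True where the deviation is not among the flow's largest top_fraction."""
    devs = ewma_deviations(series, alpha)
    rank = min(len(devs) - 1, int(len(devs) * (1 - top_fraction)))
    cutoff = sorted(devs)[rank]
    return [d <= cutoff for d in devs]


def benign_time_slots(X, alpha=0.2, top_fraction=0.01):
    """Time slots in which every flow is labeled benign."""
    n_flows = len(X[0])
    masks = [benign_mask([row[f] for row in X], alpha, top_fraction)
             for f in range(n_flows)]
    return [t for t in range(len(X)) if all(m[t] for m in masks)]
```

A detector's FPR is then estimated from its alarms on the slots returned by
`benign_time_slots`, and its FNR from its misses on matrices produced by `inject_anomaly`.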

We  simulate  a  DoS  attack  along  every  flow  at  every  time,  one-at-a-time.  We  average 
FNRs  over  all  144  possible  anomalous  flows  and  all  2016  anomaly  times.  When  reporting  the 
effect  of  an  attack  on  traffic  volumes,  we  first  average  over  links  within  each  flow  then  over 
flows.  Furthermore  we  generally  report  average  volumes  relative  to  the  pre-attack  average 
volumes.  Thus  a  single  poisoning  experiment  was  based  on  one  week  of  poisoning  with  FNRs 
computed  during  the  test  week  that  includes  144  x  2016  samples  coming  from  the  different 
flows  and  time  slots.  Because  the  poisoning  is  deterministic  in  Add-More-If-Bigger  this 
experiment  was  run  once  for  that  scheme.  In  contrast,  for  the  Random  poisoning  scheme, 
we  ran  20  independent  repetitions  of  the  poisoning  experiment  to  average-out  the  effects  of 
randomness  in  each  individual  run. 

To produce the ROC curves, we use the squared prediction errors (SPEs) produced by the
detection methods on the anomalous and normal examples from the test set. By
varying the method's threshold (usually fixed as the Q-statistic or the Laplace threshold)
from $-\infty$ to $\infty$, a curve of possible (FPR, TPR) pairs is produced from the set of SPEs;
the Q-statistic and Laplace threshold each correspond to one such point in ROC space.
We adopt the Area Under Curve (AUC) statistic from Information Retrieval to directly
compare ROC curves since one curve out of a pair of curves does not always dominate the
other. The area under an ROC curve of detector $A$ estimates the conditional probability

    $$AUC(A) \approx \Pr \left( SPE_A(y_1) > SPE_A(y_2) \right) ,$$

given anomalous and normal random link volume vectors $y_1$ and $y_2$. The ideal detector has
an AUC of 1, while the random predictor achieves an AUC of 0.5.
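
The probabilistic reading of the AUC suggests a direct way to compute it from two lists
of SPE scores, sketched below together with the threshold sweep that traces the ROC curve.
The function names are illustrative, and counting ties as one half is a common convention
assumed here rather than something specified above.

```python
def auc(anomalous_spe, normal_spe):
    """Fraction of (anomalous, normal) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for a in anomalous_spe:
        for n in normal_spe:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous_spe) * len(normal_spe))


def roc_points(anomalous_spe, normal_spe):
    """(FPR, TPR) pairs obtained by sweeping the alarm threshold over all scores."""
    points = [(1.0, 1.0)]        # threshold below every score: everything is flagged
    for th in sorted(set(anomalous_spe) | set(normal_spe)):
        fpr = sum(s > th for s in normal_spe) / len(normal_spe)
        tpr = sum(s > th for s in anomalous_spe) / len(anomalous_spe)
        points.append((fpr, tpr))
    return points


score = auc([3, 4, 5], [1, 2, 3.5])   # 8 of the 9 pairs are ordered correctly
curve = roc_points([3, 4, 5], [1, 2, 3.5])
```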


2.3.4.3  Single Period and Boiling Frog Poisoning


We evaluate the effectiveness of our attacker strategies using weeks 20 and 21 from
the Abilene dataset to simulate the Single-Training Period attacks. The PCA algorithm is
trained on the week 20 traffic matrix poisoned by the attacker; we then inject attacks during
week 21 to see how often the attacker can evade detection. We select these particular weeks
because PCA achieved the lowest FNRs on these during testing.

To test the Boiling Frog attack we simulate traffic matrix data, inspired by methods
used by Lakhina et al. (2004a). Our simulations present multiple weeks of stationary data
to the adversary. While such data is unrealistic in practice, it is an easy case on which PCA
should succeed. Anomaly detection under non-stationary conditions is difficult due to the
learner's inability to distinguish between benign data drift and adversarial poisoning. Thus
demonstrated flaws of PCA in the stationary case constitute strong results. We decided
to validate the Boiling Frog attack on a synthesized multi-week dataset, because the 6
month Abilene dataset of Zhang et al. (2005) proved to be too non-stationary for PCA
to consistently operate well from one week to the next. It is unclear whether the
non-stationarity observed in this data is prevalent in general or whether it is an artifact of the
dataset.

We synthesize a multi-week set of OD flow traffic matrices, with stationarity on the
inter-week level. We use a three step generative procedure to model each OD flow separately from
the real-world Abilene dataset. First the underlying daily cycle of the OD flow $f$ time series
is modeled by a sinusoidal approximation. Then the times at which the flow is experiencing
an anomaly are modeled by a Bernoulli arrival process with inter-arrival times distributed
according to the geometric distribution. Finally Gaussian white noise is added to the base
sinusoidal model during times of benign OD flow traffic; and exponential traffic is added to
the base model during times of anomalous traffic. We next describe the process of fitting
this generative model to the week 20 Abilene data in more detail.

In step 1, we capture the underlying cyclic trends via Fourier basis functions. We use
sinusoids of periods of 7, 5 and 3 days, and 24, 12, 6, 3 and 1.5 hours, as well as a constant
function (Lakhina et al., 2004a). For each OD flow, we find the Fourier coefficients from
the flow's projection onto this basis. We next remove the portion of the traffic modeled
by this Fourier forecaster and model the remaining residual traffic via two processes. One
is a noise process modeled by a zero-mean Gaussian to capture short-term benign traffic
variance. The second process models volume anomalies as being exponentially distributed.
Such a model is necessary because anomalies exist in the Abilene data.

In step 2 we select which of the two noise processes is used at each time interval. After
computing our model's residuals (the difference between the observed traffic and the traffic
predicted by the sinusoidal model) we note the smallest negative residual value $-m$. We assume
that residuals in the interval $[-m, m]$ correspond to benign traffic and that residuals exceeding
$m$ correspond to traffic anomalies. We separate benign variation and anomalies in this
way since these effects behave quite differently. (This is an approximation but it works
reasonably well for most OD flows.) Negative residual traffic reflects benign variance, and
since we assume that benign residuals have a zero-mean distribution, it follows that such
residuals should lie within the interval $[-m, m]$. Upon classifying residual traffic as benign
or anomalous we then model anomaly arrival times as a Bernoulli arrival process. Under this
model the inter-anomaly arrival times become geometrically distributed. Since we consider
only spatial PCA methods, the placement of anomalies is of secondary importance.

For the final step, the parameters for the two residual traffic volume processes and the
inter-anomaly arrival process are inferred from the residual traffic using the Maximum
Likelihood estimates of the Gaussian's variance and the exponential and geometric rates, respectively.
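
Steps 2 and 3 can be sketched as below, taking one flow's residuals (after the sinusoidal
trend of step 1 has been removed) as given. This is a hedged illustration: the function
names are hypothetical, the exponential is fit to the anomalous residuals themselves
rather than to their excess over $m$, and the sketch assumes at least one negative residual
and at least two anomalies.

```python
import math
import random


def fit_residual_model(residuals):
    """Return (m, sigma2, exp_rate, geo_p) fitted to one flow's residuals."""
    m = -min(residuals)                                   # smallest negative residual is -m
    benign = [r for r in residuals if r <= m]
    anomaly_times = [t for t, r in enumerate(residuals) if r > m]
    anomalies = [residuals[t] for t in anomaly_times]
    sigma2 = sum(r * r for r in benign) / len(benign)     # zero-mean Gaussian MLE
    exp_rate = len(anomalies) / sum(anomalies)            # exponential MLE: 1 / mean
    gaps = [b - a for a, b in zip(anomaly_times, anomaly_times[1:])]
    geo_p = len(gaps) / sum(gaps)                         # geometric MLE: 1 / mean gap
    return m, sigma2, exp_rate, geo_p


def simulate_flow(trend, sigma2, exp_rate, geo_p, rng):
    """Add Gaussian noise or an exponential anomaly to each point of the trend."""
    out = []
    for base in trend:
        if rng.random() < geo_p:          # Bernoulli arrival: geometric inter-arrival times
            out.append(base + rng.expovariate(exp_rate))
        else:
            out.append(base + rng.gauss(0.0, math.sqrt(sigma2)))
    return out


residuals = [-2, 1, -1, 0.5, 10, -0.5, 1.5, 8, -1.5, 2, 0, 12]
m, sigma2, exp_rate, geo_p = fit_residual_model(residuals)
week = simulate_flow([100.0] * 50, sigma2, exp_rate, geo_p, random.Random(1))
```

Drawing several such weeks from the same fitted parameters yields data that is stationary
at the inter-week level by construction.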

We include goodness-of-fit results for four OD flows: flow 144 which maximizes mean and
variance among all 144 flows; flow 113 which has one of the smallest means and variances
among all 144 flows; and flows 15 and 75 which have median mean and variance, respectively,
among all 144 flows. After manual inspection on all flows we believe these flows to be
representative elephant, mouse and two mid-level flows, respectively. Figures 2.10-2.14
include evaluations of the fit of the Gaussian, Exponential and Geometric distributions to
the three processes via quantile-quantile plots. In general the Gaussian and Exponential
Q-Q plots for the traffic volume processes are close to linear, illustrating good fits. The Q-Q
plots for the Geometric inter-anomaly arrival times in Figure 2.14 show more variable
results. However we consider only spatial PCA methods in this work so the placement of
anomalies is of secondary importance. For each of the four flows, we also plot the time series
for a week of both the Abilene data and our simulated model. These results establish the
suitability of this model for the purpose of evaluating the Boiling Frog attack.
Figure 2.10: For flow 144: (top) Gaussian Q-Q plot of normal residuals; (middle)
exponential Q-Q plot of anomalous residuals; (bottom) simulated time series in gray.

Figure 2.11: For flow 75: (top) Gaussian Q-Q plot of normal residuals; (middle)
exponential Q-Q plot of anomalous residuals; (bottom) simulated time series in gray.

Figure 2.12: For flow 15: (top) Gaussian Q-Q plot of normal residuals; (middle)
exponential Q-Q plot of anomalous residuals; (bottom) simulated time series in gray.

Figure 2.13: For flow 113: (top) Gaussian Q-Q plot of normal residuals; (middle)
exponential Q-Q plot of anomalous residuals; (bottom) simulated time series in gray.

Figure 2.14: The geometric Q-Q plot of the inter-arrival times for flows 144 (top left), 75
(top right), 15 (bottom left), and 113 (bottom right).


In our simulations, we constrain all link volumes to respect the link capacities in the
Abilene network: 10 Gbps for all but one link that operates at one fourth of this rate. We
cap chaff that would cause traffic to exceed the link capacities.

2.3.5  Poisoning  Effectiveness 

We now present the results of the aforementioned experiments for evaluating the
poisoning strategies on PCA-based detection.

2.3.5.1  Single-Training Period Poisoning: Attacker Capabilities vs. Success

We begin by measuring the evasive success of our poisoning strategies, paying special
attention to the effect of adversarial information and control. We then proceed to explore
the overall performance of PCA-based detection when trained on poisoned data.

Figure 2.15: Success of evading PCA under Single-Training Period poisoning attacks
using 3 chaff methods.

Figure 2.16: ROC curves of PCA under Single-Training Period poisoning attacks.


Measuring Evasive Success. We evaluate the effectiveness of our three data poisoning
schemes in Single-Training Period attacks. During the testing week, the attacker launches
a DoS attack in each 5 minute time window. The results of these attacks are displayed in
Figure 2.15. Although our poisoning schemes focus on adding variance, the mean traffic of
the OD flow being poisoned increases as well, increasing the means of all links that
the OD flow traverses. The x-axis in Figure 2.15 displays the relative increase in the mean
rate. We average over all experiments (i.e., over all OD flows). Representative numerical
results are summarized in Table 2.2.

As expected, the increase in evasion success is smallest for the uninformed strategy,
intermediate for the locally-informed scheme, and largest for the globally-informed poisoning
scheme. The more adversarial control, the more effective the poisoning attack. A
locally-informed attacker can use the Add-More-If-Bigger scheme to raise his evasion success to
28% from the baseline FNR of 3.67% via a 10% average increase in the mean link rates
due to chaff. Although 28% may not be viewed as a high likelihood of evasion, the attacker
success rate is nearly 8 times larger than the unpoisoned PCA model's rate. This number
represents an average over attacks launched in each 5 minute window, so the attacker could
simply retry multiple times. With our Globally-Informed scheme and a 10% average increase in
the mean link rates, the unpoisoned FNR is raised by a factor of 10 to 38% and eventually
to over 90%.
Poisoning scheme      Type                 FNR (5% chaff)    FNR (10% chaff)
Random                Uninformed           5.21% (×1.4)      20.28% (×5.5)
Add-More-If-Bigger    Locally-informed     9.98% (×2.7)      28.33% (×7.7)
Globally-Informed     Globally-informed    13.36% (×3.6)     37.79% (×10.3)

Table 2.2: Single-Training Period attacks using the three poisoning schemes. Test FNRs are
given for chaff that increases attacked link volumes by 5% and 10%. These results correspond
to the curves in Fig. 2.15 at 1.05 and 1.1. Alongside each FNR is the multiplicative increase
to the baseline FNR of 3.67%.
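The multiplicative factors in Table 2.2 are each FNR divided by the 3.67% baseline, and the remark that the attacker "could simply retry" can be made concrete. The following minimal sketch recomputes the factors and estimates the chance of at least one evasion over several attempts. The independence of attempts is an assumption made here for illustration only; the text does not claim it, and the function names are hypothetical.

```python
BASELINE_FNR = 3.67  # percent, unpoisoned PCA (value taken from the text)

# FNRs in percent from Table 2.2, at 5% and 10% chaff respectively.
TABLE_2_2 = {
    "Random": (5.21, 20.28),
    "Add-More-If-Bigger": (9.98, 28.33),
    "Globally-Informed": (13.36, 37.79),
}


def factor(fnr_percent):
    """Multiplicative increase over the baseline FNR, rounded as in the table."""
    return round(fnr_percent / BASELINE_FNR, 1)


def evade_within(p, attempts):
    """P(at least one evasion in `attempts` tries), ASSUMING independent tries."""
    return 1.0 - (1.0 - p) ** attempts


factors = {name: tuple(factor(v) for v in fnrs) for name, fnrs in TABLE_2_2.items()}
five_tries = evade_within(0.2833, 5)  # locally-informed attacker, 10% chaff
```

Under the independence assumption, five attempts at the 28.33% per-window rate give roughly an 81% chance of at least one evasion, which is why a modest per-attack FNR can still be a serious weakness.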


The large difference between the performance of the locally-informed and globally-informed
attackers is easy to understand. Recall that the globally-informed attacker knows a great
deal more (traffic on all links, and future traffic levels) than the locally-informed one (who
only knows the traffic status of a single ingress link). We consider the locally-informed
adversary to have succeeded quite well with only a small view of the network. An adversary
is unlikely to be able to acquire, in practice, the capabilities used in the globally-informed
poisoning attack. Moreover, adding 30% chaff in order to obtain a 90% evasion success is
dangerous in that the poisoning activity itself is likely to be detected. Add-More-If-Bigger
therefore presents a nice trade-off, from the adversary's point of view, between poisoning
effectiveness and attacker capabilities and risks. We therefore use Add-More-If-Bigger, the
locally-informed strategy, for many of the remaining experiments.


Measuring Overall Detector Performance Under Poisoning. We evaluate the PCA
detection algorithm on both anomalous and normal data, as described in Section 2.3.4.2,
producing the Receiver Operating Characteristic (ROC) curves displayed in Figure 2.16.
We produce an ROC curve (as shown) by first training a PCA model on the unpoisoned
data from week 20. We next evaluate the algorithm when trained on data poisoned by
Add-More-If-Bigger.

To  validate  PCA-based  detection  on  poisoned  training  data,  we  poison  exactly  one  flow 
at  a  time  as  dictated  by  the  threat  model.  Thus,  for  relative  chaff  volumes  ranging  from 
5%  to  50%,  Add-More-If-Bigger  chaff  is  added  to  each  flow  separately  to  construct  144 
separate  training  sets  and  144  corresponding  ROC  curves  for  the  given  level  of  poisoning. 
The poisoned curves in Fig. 2.16 display the averages of these ROC curves (i.e., the average
TPR over the 144 flows for each FPR).
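The averaging described above (mean TPR across per-flow curves at each FPR) can be sketched as follows. This is an illustrative sketch only: the helper names, the step-function reading of each ROC curve, and the toy scores are assumptions, not the code used for the experiments.

```python
import numpy as np


def roc_points(scores, labels):
    """ROC points from sweeping a threshold over scores (label 1 = anomaly).

    Ties between scores are not treated specially in this sketch.
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    y = np.asarray(labels)[order]
    tps = np.cumsum(y == 1)
    fps = np.cumsum(y == 0)
    tpr = np.concatenate(([0.0], tps / max(int(tps[-1]), 1)))
    fpr = np.concatenate(([0.0], fps / max(int(fps[-1]), 1)))
    return fpr, tpr


def tpr_at(fpr, tpr, grid):
    """Best TPR achievable at or below each FPR value in grid (step-function ROC)."""
    idx = np.searchsorted(fpr, grid, side="right") - 1
    return tpr[idx]


def average_roc(curves, grid):
    """Average the TPR over all curves at each FPR in grid."""
    return np.mean([tpr_at(f, t, grid) for f, t in curves], axis=0)


grid = np.array([0.0, 0.5, 1.0])
good = roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])  # perfect ranking
bad = roc_points([0.9, 0.8, 0.2, 0.1], [0, 0, 1, 1])   # inverted ranking
mean_tpr = average_roc([good, bad], grid)
```

In the setting of the text, `curves` would hold one ROC curve per poisoned flow (144 in total) rather than the two toy curves used here.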

We  see  that  the  poisoning  scheme  can  throw  off  the  balance  between  false  positives  and 
false  negatives  of  the  PCA  detector:  The  detection  and  false  alarm  rates  drop  together 
rapidly as the level of chaff is increased. At 10% relative chaff volume, performance degrades
significantly from the ideal ROC curve (lines from (0, 0) to (0, 1) to (1, 1)), and at 20% the
PCA’s  mean  ROC  curve  is  already  close  to  that  of  blind  randomized  prediction  (the  y  =  x 
line  with  0.5  AUC).  Poisoning  its  training  data  dramatically  reduces  the  overall  efficacy  of 
the  PCA-based  detector. 
Figure 2.17: Evasion success of PCA under Boiling Frog poisoning attacks. (Panel title:
Boiling Frog Poisoning: Evading PCA.)

Figure 2.18: Chaff rejection rates of PCA under poisoning attacks shown in Figure 2.17.
(Panel title: Boiling Frog Poisoning: PCA Rejections.)

2.3.5.2 Multi-Training Period Poisoning


We now evaluate the effectiveness of the Boiling Frog strategy, which contaminates the
training data over multiple training periods. In Figure 2.17 we plot the FNRs against the
poisoning duration for the PCA detector. We examine four different poisoning schedules
with growth rates g of 1.01, 1.02, 1.05 and 1.15 respectively. The goal of the schedule is to
increase the attacked links' average traffic by a factor of g from week to week. The attack
strength parameter θ (see Section 2.3.2) is chosen to achieve this goal. We see that the FNR
dramatically increases for all four schedules as the poisoning duration increases. With a 15%
growth rate the FNR is increased to more than 70% from 3.67% over 3 weeks of poisoning;
even with a 5% growth rate the FNR is increased to 50% over 3 weeks. Thus Boiling Frog
attacks are effective even when the amount of poisoned data increases rather slowly.
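Because each schedule multiplies the attacked links' average traffic by g every week, the growth compounds. A small sketch of the cumulative factor after a given number of weeks follows; the function name is hypothetical and the sketch covers only the arithmetic of the schedule, not the choice of θ.

```python
def cumulative_growth(g, weeks):
    """Factor by which average traffic has grown after `weeks` weeks at weekly rate g."""
    return g ** weeks


schedules = [1.01, 1.02, 1.05, 1.15]  # growth rates g examined in the text
after_three_weeks = {g: cumulative_growth(g, 3) for g in schedules}
```

For instance, three weeks at g = 1.05 compound to a factor of about 1.16, and three weeks at g = 1.15 to about 1.52.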

Recall that the detector is retrained every week using the data collected from the previous
week. However, the data from the previous week is first filtered by the detector itself: the
training data at any time point flagged as anomalous is thrown out. Figure 2.18 shows the
proportion of chaff rejected each week by PCA (the chaff rejection rate) for the Boiling Frog
strategy. The three slower schedules enjoy a relatively small constant rejection rate close
to 5%. The 15% schedule begins with a relatively high rejection rate, but after a month
sufficient amounts of poisoned traffic mis-train PCA, after which point the rates drop to the
level of the slower schedules. We conclude that the Boiling Frog strategy with a moderate
growth rate of 2-5% can significantly poison PCA, dramatically increasing its FNR while
still going unnoticed by the detector.
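The weekly retrain-after-self-filtering loop described above can be sketched roughly as below. This is a simplified stand-in, not the detector evaluated in this chapter: the empirical-quantile threshold on residual norms, the single principal direction, and all function names are assumptions chosen to keep the example short.

```python
import numpy as np


def residual_norms(Y, mu, P):
    """Norm of each row's component outside the subspace spanned by P's columns."""
    Z = Y - mu
    return np.linalg.norm(Z - Z @ P @ P.T, axis=1)


def fit_detector(Y, k=1, false_alarm=0.05):
    """Fit the mean, top-k principal directions, and a residual threshold.

    The threshold is an empirical quantile (an assumption for this sketch).
    """
    mu = Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y - mu, full_matrices=False)
    P = Vt[:k].T
    threshold = np.quantile(residual_norms(Y, mu, P), 1.0 - false_alarm)
    return mu, P, threshold


def retrain(model, Y_week, k=1, false_alarm=0.05):
    """Drop time points the current model flags, then refit on what remains."""
    mu, P, threshold = model
    keep = residual_norms(Y_week, mu, P) <= threshold
    return fit_detector(Y_week[keep], k, false_alarm), 1.0 - keep.mean()


rng = np.random.default_rng(0)
model = fit_detector(rng.normal(size=(500, 4)))          # initial clean week
model, rejected = retrain(model, rng.normal(size=(500, 4)))  # next week's data
```

The second return value of `retrain` is the fraction of that week's time points rejected before retraining, the analogue of the rejection rate plotted in Figure 2.18 (which there is restricted to chaff).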


Figure 2.19: Evasion success of ANTIDOTE under Single-Training Period poisoning attacks
using 3 chaff methods. (Panel title: Single Poisoning Period: Evading ANTIDOTE.)

Figure 2.20: ROC curves of ANTIDOTE vs. PCA under Single-Training Period poisoning
attacks.

By comparing Figures 2.15 and 2.17 we observe that in order to raise the FNR to 50%,
an increase in mean traffic of roughly 18% for the Single-Training Period attack is needed,
whereas in the Boiling Frog attack the same result can be achieved with only a 5% average
traffic increase spread across 3 weeks. The Boiling Frog attack is much more stealthy than
the Single-Training Period attack.


2.3.6  Defense  Performance 

We now assess how ANTIDOTE performs in the face of two types of poisoning attacks,
one that lasts a single training period, and one that lasts for multiple training periods.
For the longer time horizon, we use the Add-More-If-Bigger poisoning scheme to select
how much chaff to add at each point in time. We compare ANTIDOTE's performance to the
original PCA-subspace method.


2.3.6.1 Single-Training Period Poisoning


Measuring Evasive Success. In Figure 2.19 we illustrate ANTIDOTE's FNR for various
levels of average poisoning that occur in a Single-Training Period attack. We can compare
this to Figure 2.15, which shows the same metric for the original PCA solution. We see here
that the evasion success of the attack is dramatically reduced. For any particular level of
chaff, the evasion success rate is approximately cut in half. Interestingly, the most effective
poisoning scheme on PCA, Globally-Informed, is the least effective poisoning scheme in
the face of our robust PCA solution. We believe the reason for this is that our
Globally-Informed scheme was designed specifically to circumvent PCA. Now that the detector
has changed, Globally-Informed is no longer optimized for the active detector. For the new
detector, Random remains equally effective because constant shifts in a large subset of the
data create a bimodality that is difficult for any subspace method to reconcile. This effect
is still muted compared to the dramatic success of locally-informed methods on the original
detector. Further, constant shift poisoning creates unnatural traffic patterns that we believe
can be detected. Given this evidence, we conclude that ANTIDOTE is an effective defense
against realistic poisoning attacks.


Measuring Overall Detector Performance Under Poisoning. Since poisoning activities
distort a detector, they affect not only the FNRs but also the false positives. To
explore this trade-off, we use ROC curves in Figure 2.20 for both ANTIDOTE and PCA.
For comparison purposes, we include cases when the training data is both unpoisoned and
poisoned. For the poisoned training scenario, each point on the curve is the average over
144 poisoning scenarios in which the training data is poisoned along one of the 144 possible
flows. While ANTIDOTE performs very similarly to PCA on unpoisoned training data,
PCA significantly under-performs ANTIDOTE in the presence of poisoning. With a moderate
mean chaff volume of 10%, ANTIDOTE's average ROC curve remains almost unchanged while
PCA's curve collapses towards the y = x curve of the blind random detector. This means
that the normal balance between FNRs and false positives is completely thrown off with
PCA; however ANTIDOTE continues to retain a good operating point for these two common
performance measures. In summary, when we consider the two performance measures of
FNRs and FPRs, we give up only an insignificant amount of performance by using ANTIDOTE
when no poisoning events occur, yet we see enormous performance gains for both metrics when
poisoning attacks do occur.

Given Figures 2.19 and 2.20 alone, it is conceivable that ANTIDOTE outperforms PCA
only on average, and not on all flows that could be targeted for poisoning. In place of
plotting all 144 poisoned ROC curves, we display the areas under these curves (AUC) for
the two detection methods in Figure 2.21 under 10% chaff targeting each of the 144 flows
individually. Not only is average performance much better for robust PCA, but it enjoys
better performance for more flows and by a large amount. Although PCA performs slightly
better for some flows, both methods have excellent detection performance on those flows
(because their AUCs are close to 1), and hence the distinction between
the two is insignificant for those specific flows. In summary, ANTIDOTE enjoys significantly
superior performance for the majority of poisoned flows, while PCA's performance is only
ever superior by a small margin.
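As a reminder of how an AUC summarizes an ROC curve, the following minimal sketch computes the area with the trapezoid rule. The function name and the toy curves are illustrative assumptions; the text does not state which numerical rule was used to obtain the AUCs.

```python
def auc(fpr, tpr):
    """Area under an ROC curve given as points sorted by increasing FPR."""
    area = 0.0
    for i in range(1, len(fpr)):
        # Trapezoid between consecutive ROC points.
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
    return area


ideal_auc = auc([0.0, 0.0, 1.0], [0.0, 1.0, 1.0])  # (0,0) -> (0,1) -> (1,1)
random_auc = auc([0.0, 1.0], [0.0, 1.0])           # the y = x line
```

The ideal curve yields an AUC of 1 and the y = x line yields 0.5, matching the two reference points used throughout this section.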

Figure 2.22 plots the mean AUC (averaged from the 144 ROC curves' AUCs where flows
are poisoned separately) achieved by the detectors, as the level of chaff is intensified. Notice
that ANTIDOTE behaves similarly to PCA under zero-chaff conditions, yet its performance
quickly becomes superior as the amount of contamination grows. In fact, it does not take
much poisoning for ANTIDOTE to exhibit much stronger performance. As PCA's performance
drops, it starts approaching a random detector (equivalent to 0.5 AUC) for amounts
of chaff exceeding 20%.

Figure 2.21: The 144 AUCs from the poisoned ROC curves for each possible target flow
and their mean. (Panel title: Single Poisoning Period: Flows' AUCs at 10% Chaff; axis
label: PCA AUCs.)

Figure 2.22: The mean AUCs versus mean chaff levels for ANTIDOTE and PCA.

In these last few figures, we have seen the FNR and FPR performance as it varies across
flows and quantity of poisoning. In all cases, it is clear that ANTIDOTE is an effective defense
and dramatically outperforms a solution that was not designed to be robust. We believe this
evidence indicates that robust techniques are a promising avenue for SML algorithms
used for security applications.


2.3.6.2 Multi-Training Period Poisoning


We now evaluate the effectiveness of ANTIDOTE against the Boiling Frog strategy, which
occurs over multiple successive training periods. In Figure 2.23 we see the FNRs for
ANTIDOTE with the four different poisoning schedules. We observe two interesting behaviors.
First, for the two most stealthy poisoning strategies (1.01 and 1.02), ANTIDOTE shows
remarkable resistance in that the evasion success increases very slowly, e.g., after 10 training
periods it is still below 20%. This is in stark contrast to PCA (see Figure 2.17) in which,
for example, after 10 weeks, the evasion success is over 50% for the 1.02 poisoning growth
rate scenario. Second, under PCA the evasion success keeps rising over time. However with
ANTIDOTE under the heavier poisoning strategies, we see that the evasion success actually
starts to decrease after some time. The reason for this is that ANTIDOTE has started
rejecting so much of the training data that the poisoning strategy starts to lose its effectiveness.

Figure 2.23: Evasion success of ANTIDOTE under Boiling Frog poisoning attacks. (Panel
title: Boiling Frog Poisoning: Evading ANTIDOTE; x-axis: Attack duration (weeks).)

Figure 2.24: Chaff rejection rates of ANTIDOTE under Boiling Frog poisoning attacks.
(Panel title: Boiling Frog Poisoning: ANTIDOTE Rejections; x-axis: Week.)

To look more closely at this behavior we show the proportion of chaff rejected by ANTIDOTE
under multi-training period poisoning episodes in Figure 2.24. We see that the two
slower schedules have an almost constant rejection rate close to 9%, which is higher than
that of original PCA (which is close to 5%). For the faster poisoning growth schedules (5%
and 15%) we observe that ANTIDOTE rejects an increasing amount of the poison data. This
reflects a good target behavior for any robust detector: to reject more training data as the
contamination grows. From these figures we conclude that the combination of techniques we
use in ANTIDOTE, namely a PCA-based detector designed with robust dispersion goals
combined with a Laplace-based cutoff threshold, is very effective at maintaining a good balance
between false negative and false positive rates throughout a variety of poisoning scenarios
(different amounts of poisoning, on different OD flows, and on different time horizons).
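To illustrate why a robust dispersion measure resists variance injection, the sketch below compares the standard deviation with the median absolute deviation (MAD) on a toy sample before and after a single large chaff-like value is added. Using the MAD here is an assumption made for illustration; this passage does not specify which robust dispersion estimate ANTIDOTE relies on, and the data are invented.

```python
import numpy as np


def mad(x):
    """Median absolute deviation: a robust measure of dispersion."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))


clean = np.array([8.0, 9.0, 10.0, 11.0, 12.0, 9.0, 10.0, 11.0, 10.0, 10.0])
poisoned = clean.copy()
poisoned[0] = 100.0  # one time point inflated by chaff (hypothetical value)

std_ratio = np.std(poisoned) / np.std(clean)  # inflated by the outlier
mad_ratio = mad(poisoned) / mad(clean)        # unchanged in this example
```

In this toy sample the standard deviation grows by more than a factor of 20 while the MAD is unchanged, which is the kind of behavior a detector "designed with robust dispersion goals" seeks to exploit.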

2.4  Summary 

In this chapter we investigate two large case-studies on Causative attacks on Statistical
Machine Learning systems: attacks in which the adversary manipulates the learner by
poisoning its training data.

In the first case-study we show that an adversary can effectively disable the SpamBayes
email spam filter, by increasing its False Positive Rate (an Availability attack), with
relatively little system state information and relatively limited control over the training data.
The Usenet dictionary attack causes misclassification of 36% of legitimate ham messages
with only 1% control over the training messages, rendering SpamBayes unusable. Our
focused attack changes the classification of a target legitimate message 60% of the time with
knowledge of only 30% of the target message's tokens. We also explore two successful
defenses for SpamBayes. The RONI defense filters out dictionary attack messages with
complete success. The dynamic threshold defense also mitigates the effect of the dictionary
attacks. Focused attacks are especially difficult to defend against because of the attacker's
extra knowledge; developing effective defenses in the targeted case is an important open
problem.

In the second case-study we consider an adversary that manipulates PCA-based network-wide
volume anomaly detection for the purposes of evading detection at test time (an
Integrity attack). We study the effects of multiple poisoning strategies while varying the
amount of information available to the attacker and the time horizon over which the
poisoning occurs. We demonstrate that the PCA-subspace method can be easily compromised
(often dramatically) under all of the considered poisoning scenarios. From the attacker's
point of view, we illustrate that simple strategies can be effective and conclude that it
is not worth the risk or extra amount of work for the attacker to engage in attempts at
near-optimal globally-informed strategies. For example, when a locally-informed attacker
increases the average volume on a flow's links by 10%, the False Negative Rate (or chance of
evasion) is increased by a factor of 7. Moreover, with stealthy poisoning strategies executed
over longer time periods, an attacker can increase the FNRs to over 50% with less data
than poisoning schemes carried out during a short time window. We demonstrate that our
ANTIDOTE counter-measure based on Robust Statistics is robust to these attacks in that it
does not allow poisoning attacks to shift the false positive and false negative rates in any
significant way. We show that ANTIDOTE provides robustness for nearly all the ingress PoP
to egress PoP flows in a backbone network, rejects much of the contaminated data, and
continues to operate as a DoS defense even in the face of poisoning by variance injection
attacks.

A common theme of the two case-studies on Causative attacks is the important role of
adversarial information and control. In the first case-study on email spam filtering,
information corresponds to approximate knowledge of the victim's token distribution and control
is parameterized by the fraction of the training corpus poisoned by the attack and the size
of the poison spam messages. In the second case-study on network-wide volume anomaly
detection, information corresponds to the ability to monitor traffic on one or multiple links,
while control is most naturally exerted in the volume of chaff added to the network.
Interestingly, the forms of information and control that seem most natural to these two domains
are very different. However, we show that in both studies increased information
or increased control results in more effective attacks. We also observe that attack efficacy is
not necessarily 'linear' in adversarial capability: e.g., locally-informed poisoning of
PCA-based detection at up to moderate levels of control is just as effective as globally-informed
poisoning. We return to several of these observations in a later chapter.
Chapter 3

Querying  for  Evasion 


Beware the wolf in sheep's clothing.

-  Aesop 


In this chapter we consider attacks on trained classifiers that systematically submit
queries to a classifier with the goal of finding an instance that evades detection while being
of a near-minimal distance to a target malicious instance. According to the taxonomy of
Barreno et al. (2006) discussed in Section 1.2.2, these attacks are Exploratory attacks as
they interact with a learned model at test-time; and while the attacks are most naturally
applied as Integrity attacks (those that cause False Negatives), they are equally suited to
Availability attacks (those that aim for False Positives).


We adopt the abstract theoretical framework of Lowd and Meek (2005b), and extend
their results for evading linear classifiers to evading classifiers that partition feature space
into two classes, one of which is convex. In addition to the primary goal of finding a
distance-minimizing negative instance, we adopt the secondary goal of low query complexity (like
Lowd and Meek 2005b). A corollary of our theoretical results is that in general evasion can
be significantly easier than reverse engineering the decision boundary, which is the approach
originally taken by Lowd and Meek (2005b).


The research presented in this chapter was joint work with UCB EECS doctoral
candidate Blaine Nelson. During the course of this investigation I contributed the initial query
algorithm for the convex positive class case, its lower bound, an initial argument for the
$L_\infty$ cost lower bound, and the lower bound's extension to $L_p$ costs. Nelson led the work on
improving the convex positive class algorithm, the $L_\infty$ lower bound, the specialized $L_2$ cost
lower bound, and developing the reduction for the convex negative class case.


3.1  Introduction 

Machine learning is often used to filter or detect miscreant activities in a variety of
applications; e.g., spam, intrusion, virus, and fraud detection. All known detection techniques
have blind spots; i.e., classes of miscreant activity that fail to be detected. While learning
allows the detection algorithm to adapt over time, constraints on the learning algorithm also
may allow an adversary to programmatically find these vulnerabilities. We consider how an
adversary can systematically discover blind spots by querying the learner to find a low cost
instance that the detector does not filter. Consider a spammer who wishes to minimally
modify a spam message so that it is not classified as spam and instead reaches a user's
inbox unfiltered. By observing the responses of the spam detector, for example in a public
webmail service in which he can open accounts and send himself messages, the spammer
can search for a successful modification while using few queries.

The evasion problem of finding a low cost negative instance with few queries was first
posed by Lowd and Meek (2005b). We continue their line of research by generalizing it to the
family of convex-inducing classifiers, that is, classifiers that partition feature space into two
sets one of which is convex. Convex-inducing classifiers are a natural family to examine that
includes linear classifiers, neural networks with a single hidden layer (convex polytopes),
one-class classifiers that predict anomalies by thresholding the log-likelihood of a log-concave
(or unimodal) density function, the one-class SVM with linear kernel, and quadratic
classifiers of the form $x^\top A x + b^\top x + c \geq 0$ for semidefinite $A$. The convex-inducing
classifiers also include classifiers whose support is the intersection of a countable number of
halfspaces, cones, or balls. We also consider more general costs than the weighted $L_1$ costs
considered by Lowd and Meek (2005b).


We show that evasion does not require reverse engineering the classifier, that is, querying
the classifier to learn its decision boundary. The algorithm of Lowd and Meek (2005b) for
evading linear classifiers reverse-engineers the classifier's decision boundary but is still
efficient. Our algorithms for evading convex-inducing classifiers do not require fully
estimating the boundary (which is hard in the general case; see Rademacher and Goyal
2009) or reverse-engineering the classifier's state. Instead, we directly search for a
minimal cost-evading instance. Our algorithms require only polynomially many queries to achieve
$(1 + \epsilon)$-multiplicative approximations of cost-optimal negative instances in feature space,
with an algorithm for convex positive classes for $L_p$ cost ($p \leq 1$) solving the linear case
with $O\left(\log\frac{1}{\epsilon} + \sqrt{\log\frac{1}{\epsilon}}\, D\right)$ queries, which is fewer than the
previously-published reverse-engineering technique. A new lower bound of
$O\left(\log\frac{1}{\epsilon} + D\right)$ shows that this complexity is
close to optimal. For $p > 1$ we show that finding evading instances that come very close to
having optimal $L_p$ cost requires exponential query complexity.

Our geometric random walk-based approach for evading classifiers with convex negative
classes while minimizing $L_p$ costs ($p > 1$) has query complexity
$O^*\left(D^5 \log\frac{1}{\epsilon}\right)$. A consequence of polynomial complexity for
convex-inducing classifiers is that in general, evasion
can be significantly easier than reverse engineering the decision boundary.


Chapter Organization. We conclude this introductory section with a brief summary of
related work. Section 3.2 overviews the abstract framework of Lowd and Meek (2005b) upon
which we build and covers preliminaries for Section 3.3, which develops and analyzes query
algorithms for evading convex-inducing classifiers while minimizing $L_1$ cost. Section 3.4
considers evasion for minimizing general $L_p$ costs. We conclude the chapter with a summary
of our key contributions.
3.1.1 Related Work

Lowd and Meek (2005b) first explored the evasion problem, and developed a method
that reverse-engineered linear classifiers. Our approach generalizes their result and improves
upon it in three significant ways.

•  We consider a more general family of classifiers: the family of convex-inducing
classifiers that partition feature space into two sets one of which is convex. This family
includes the family of linear classifiers as a special case.

•  Our approach does not fully estimate the classifier's decision boundary (which is
generally hard; Rademacher and Goyal 2009) or reverse-engineer the classifier's state;
instead, we directly search for an instance that the classifier recognizes as negative
that is close to the desired attack instance (an evading instance of near-minimal cost).

•  Although our algorithms successfully evade a more general family of classifiers, our
algorithms still only use a limited number of queries: they require only a number of
queries polynomial in the dimension of the instance space. Moreover, our K-step
MultiLineSearch Algorithm solves the linear case with fewer queries than the
previously-published reverse-engineering technique.
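The idea of searching directly for a near-boundary negative instance with membership queries, rather than estimating the boundary, can be illustrated by a single binary line search between the attack instance and a known negative instance. This is only a sketch of that one primitive under stated assumptions (a convex class, so the label changes once along the segment); it is not the K-step MultiLineSearch algorithm, and the function names and toy classifier are hypothetical.

```python
def line_search(is_positive, x_attack, x_negative, tol=1e-3):
    """Binary search on the segment from x_negative (t=0) to x_attack (t=1).

    Returns a negative instance within tol (in t) of the boundary, plus the
    number of membership queries used.
    """
    lo, hi = 0.0, 1.0  # lo: known negative position, hi: known positive position
    queries = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        point = [n + mid * (a - n) for a, n in zip(x_attack, x_negative)]
        queries += 1
        if is_positive(point):
            hi = mid
        else:
            lo = mid
    best = [n + lo * (a - n) for a, n in zip(x_attack, x_negative)]
    return best, queries


def toy_classifier(x):
    """Hypothetical linear classifier: positive (malicious) when x[0] + x[1] > 1."""
    return x[0] + x[1] > 1.0


evading, used = line_search(toy_classifier, [2.0, 2.0], [0.0, 0.0])
```

The number of queries grows only logarithmically in 1/tol along each search direction, which is the source of the logarithmic terms in the query complexities quoted in this chapter.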


Learning the decision boundary by submitting membership queries requires exponential
numbers of queries for general convex-inducing classifiers, since estimating volumes of convex
bodies (known to be NP-hard; Dyer and Frieze 1992; Rademacher and Goyal 2009) reduces
to learning the boundary. Thus a consequence of our algorithms, which evade all
convex-inducing classifiers with polynomial complexity, is that evasion is significantly easier than
reverse engineering.


Dalvi et al. (2004) use a cost-sensitive game theoretic approach to preemptively patch
a classifier's blind spots. They construct a modified classifier designed to detect optimally
modified instances. This work is complementary to our own; we examine optimal evasion
strategies while they have studied mechanisms for adapting the classifier. In this work we
assume the classifier is not adapting during evasion.


A number of authors have studied evading sequence-based intrusion detector systems (Tan
et al., 2002; Wagner and Soto, 2002). In exploring mimicry attacks these authors
demonstrated that real IDSs could be fooled by modifying exploits to mimic normal behaviors.
These authors used offline analysis of the IDSs to construct their modifications whereas our
modifications are optimized by querying the classifier.

Finally, there is an entire field of active learning that also studies a form of query based
optimization; e.g., see Schohn and Cohn (2000). While both active learning and
near-optimal evasion explore optimal querying strategies, the objectives for these two settings
are quite different.


3.2  Background  and  Definitions 

This section summarizes the background relevant to this chapter, including
the adversarial classifier reverse engineering (ACRE) problem introduced by
Lowd and Meek (2005b), which we re-cast as a problem of evasion, and provides
preliminary notation and definitions important for describing the main results of the chapter.

We will use email spam as a running example. Section 3.2.1.1 enumerates a partial list
of applications of evasion algorithms consistent with this framework.


Example  9.  Consider  Exploratory  attacks  on  email  spam  filtering:  a  spammer  wishes  to 
send  a  spam  email  message  to  a  victim’s  email  account  that  is  protected  by  a  state-of-the-art 
learning-based  spam  filter.  The  attacker  suspects  that  the  spam  message  is  being  blocked  by 
the  filter,  so  he  must  modify  it  somehow  so  that  it  can  evade  filtering. 


3.2.1  The  Evasion  Problem 

Let $\mathcal{X} = \mathbb{R}^D$ be the feature space; each component of an instance
$x \in \mathcal{X}$ is a feature denoted by $x_d$. Let $\delta_d = (0, \ldots, 1, \ldots, 0)$
be the unit vector parallel to the $d$th coordinate axis (and sitting in the coordinate’s
positive halfspace).

We consider a family of classifiers $\mathcal{F}$ with elements $f \in \mathcal{F}$ mapping
$\mathcal{X}$ into the binary response space $\mathcal{Y} = \{'-', '+'\}$. Our attacks are
designed to operate against a static deterministic classifier: the classifier has already
been fit to data or is a hand-crafted decision rule; the adversary does not know the actual
mapping a priori but does know the family $\mathcal{F}$ from which it came. We define the two
sets that partition $\mathcal{X}$ according to the classifier’s decision rule as the positive
and negative classes $\mathcal{X}_f^+ = f^{-1}('+')$ and $\mathcal{X}_f^- = f^{-1}('-')$
respectively, and (arbitrarily) identify $\mathcal{X}_f^+$ with a malicious class of
instances. When the classifier $f$ can be understood from context (as is often the case since
it is fixed) we drop explicit reference to it and denote the classes by $\mathcal{X}^+$ and
$\mathcal{X}^-$.

Example 10. Consider the email spam problem of Example 9. The feature space for email
spam filtering is typically vectors in $\{0, 1\}^D$ corresponding to a bag-of-words model where
each dimension corresponds to a possible token (e.g., words, URLs, etc.). The positive
(negative) class corresponds to spam (ham) email messages.

Assumptions. We assume that the feature space representation is known to the adversary.
We assume that the classifier is deterministic and fixed (e.g., is designed manually without
learning, is trained offline, or is re-trained only periodically). We make the weak assumption
that the adversary has access to instances $x^- \in \mathcal{X}^-$ and $x^A \in \mathcal{X}^+$.
And finally we assume that the adversary has access to a membership query oracle for the true
classifier so that $f(x)$ may be observed for any $x \in \mathcal{X}$: there are no
restrictions on which points may be queried by the adversary. While these assumptions may not
all hold in all real-world settings, they allow us to consider a worst-case adversary.

Example 11. Consider the email spam problem of Example 9. If the victim’s account is
hosted by an open membership webmail service such as Yahoo! Mail, Gmail or Hotmail (i.e.,
accounts may be opened by anyone for free or relatively small cost, for example by
solving a CAPTCHA), then the spammer can gain access to the spam filter’s membership
oracle by simply opening an account with the webmail service. To query the oracle, the
spammer need only send himself query messages and observe whether they pass through the
filter to the account’s inbox or whether they are filtered. Finally, spammers can easily find
emails $x^- \in \mathcal{X}^-$ and $x^A \in \mathcal{X}^+$.
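The query access described in Example 11 can be pictured as a thin wrapper around the fixed classifier that also counts how many queries the adversary spends. The following is only a minimal sketch: `MembershipOracle` and `toy_spam_filter` are hypothetical names introduced here for illustration, and the toy decision rule is an assumption, not a filter discussed in the text.

```python
from typing import Callable, Sequence


class MembershipOracle:
    """Wraps a fixed, deterministic classifier and counts membership queries."""

    def __init__(self, f: Callable[[Sequence[float]], str]):
        self._f = f
        self.num_queries = 0

    def query(self, x: Sequence[float]) -> str:
        """Return the label '+' or '-' of x, charging one query."""
        self.num_queries += 1
        label = self._f(x)
        if label not in ("+", "-"):
            raise ValueError("classifier must return '+' or '-'")
        return label


def toy_spam_filter(x: Sequence[float]) -> str:
    # Hypothetical rule: a bag-of-words message is spam ('+') when it
    # contains at least two of the first three tokens.
    return "+" if sum(x[:3]) >= 2 else "-"


oracle = MembershipOracle(toy_spam_filter)
x_target = [1, 1, 1, 0]  # plays the role of x^A (a known spam message)
x_ham = [0, 0, 0, 1]     # plays the role of x^- (a known ham message)
labels = (oracle.query(x_target), oracle.query(x_ham))
```

Counting queries inside the oracle keeps the accounting separate from any particular search strategy, which matches how query complexity is measured in this chapter.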


Attack Objective. We consider an adversary with special interest in some
$x^A \in \mathcal{X}^+$. In security-sensitive settings this instance typically contains some
kind of payload that the attacker wishes to send to a system guarded by the detector. Let
$A : \mathcal{X} \to \mathbb{R}^{0+}$ be a cost function of interest to the adversary: we think
of the cost as being the distance to the target positive instance $x^A$, i.e.,
$A(x) = d(x, x^A)$, to model an adversary who is willing to alter $x^A$ to evade detection
but who is not willing to make too drastic a modification. Thus the objective of our attack
is to minimize $A$ over $\mathcal{X}^-$.

We focus on the class of weighted $L_p$ cost functions for $0 < p \le \infty$

$$A_p^{(c)}(x) = \left( \sum_{d=1}^{D} c_d \, \left| x_d - x_d^A \right|^p \right)^{1/p} , \tag{3.1}$$

where $0 < c_d < \infty$ is the (relative) cost the adversary associates with changes to the
$d$th feature. Unless stated otherwise, we understand the feature costs to be identically
one. As with Lowd and Meek (2005b) we focus primarily on weighted $L_1$ costs in Section 3.3
and explore related $L_p$ costs in Section 3.4. Weighted $L_1$ costs are particularly
appropriate for many adversarial problems since costs are assessed based on the degree to
which a feature is altered and the adversary typically is interested in some features more
than others. The next example provides a more specific discussion of the cost’s relevance in
email spam filtering.


Example 12. Consider again the email spam problem of Example 9. The spammer’s original
goal was to successfully email a spam message to the victim without the message being
filtered. This message contains some payload, typically a link (e.g., to an online pharmacy, a
drive-by-download site, etc.) or an attachment (e.g., a document infected by a virus). While
the payload cannot be altered, the surrounding message that entices the user to activate the
payload using social engineering can usually be modified to some extent without degrading
the effectiveness of the enticement too much. Additionally, spurious features may be added
(e.g., parts of the message that go unrendered). The cost function used should capture the
utility of the altered message to the adversary. The $L_1$ cost is particularly appropriate:
for the bag-of-words feature model, this cost corresponds to edit distance, a natural metric
for passages of text. Having low $L_1$ cost corresponds to a message that is actually similar
to the original spam constructed by the spammer.
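To make the cost concrete, the sketch below evaluates a weighted $L_p$ cost on bag-of-words vectors. It assumes the weighted form $\left(\sum_d c_d |x_d - x_d^A|^p\right)^{1/p}$ for finite $p$; the function name `weighted_lp_cost` and the example messages are illustrative choices, not part of the original text.

```python
from typing import Optional, Sequence


def weighted_lp_cost(
    x: Sequence[float],
    x_target: Sequence[float],
    p: float = 1.0,
    costs: Optional[Sequence[float]] = None,
) -> float:
    """Weighted L_p cost of x relative to the target x^A (finite p > 0 only)."""
    if not (0 < p < float("inf")):
        raise ValueError("this sketch only handles finite p > 0")
    if costs is None:
        costs = [1.0] * len(x)  # unit feature costs by default
    total = sum(c * abs(a - b) ** p for c, a, b in zip(costs, x, x_target))
    return total ** (1.0 / p)


# Bag-of-words example: one token removed and one token added.
x_target = [1, 1, 0, 1, 0]  # original spam message x^A
x_edit = [1, 0, 1, 1, 0]    # candidate modified message

unit_l1 = weighted_lp_cost(x_edit, x_target)                  # number of token edits
weighted_l1 = weighted_lp_cost(x_edit, x_target, costs=[1, 5, 1, 1, 1])
unit_l2 = weighted_lp_cost(x_edit, x_target, p=2.0)
```

With unit costs the $L_1$ value simply counts the token edits; raising the cost of the second token (here to 5, an arbitrary choice) models a word the spammer is reluctant to remove.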

Denote by $\mathbb{B}^C$ the closed ball in $\mathcal{X}$ with center $x^A$ and radius $C$
with respect to the distance corresponding to the given cost, i.e., $\mathbb{B}^C$ is the set
of instances with cost at most $C$. Unless stated otherwise the particular cost should be
apparent from the context.

Lowd and Meek (2005b) define minimal adversarial cost (MAC) of a classifier $f$ to be the
scalar

$$\mathrm{MAC}(f, A) = \inf_{x \in \mathcal{X}_f^-} A(x) .$$

They further define an instance to be an $\epsilon$-approximate instance of minimal
adversarial cost ($\epsilon$-IMAC) if it is a negative instance having cost no more than a
factor $(1 + \epsilon)$ times the MAC. Overloading notation, we define the set of
$\epsilon$-IMACs to be

$$\epsilon\text{-IMAC}(f, A) = \left\{ x \in \mathcal{X}_f^- \;\middle|\; A(x) \le (1 + \epsilon) \cdot \mathrm{MAC}(f, A) \right\} . \tag{3.2}$$

The adversary’s goal is to find an $\epsilon$-IMAC efficiently, by issuing a relatively small
number of queries as measured by $\epsilon$ and $D$. In the email spam setting of the running
example, this corresponds to finding a message that will reach the victim’s inbox while being
as close to a target spam message as possible. We call the overall problem the Evasion Problem.
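As a worked illustration of these definitions, consider a hypothetical linear classifier with unit feature costs under the $L_1$ cost. The classifier, the weights $w$, the threshold $b$ and the numbers below are assumptions chosen only to make the MAC and an $\epsilon$-IMAC explicit; they do not come from the original text.

```latex
% Assumptions (illustrative): f(x) = '+' iff w^T x >= b, unit feature costs,
% and the target satisfies w^T x^A > b.
\begin{align*}
\mathrm{MAC}(f, A_1)
  &= \inf_{x \,:\, w^\top x < b} \left\| x - x^A \right\|_1
   = \frac{w^\top x^A - b}{\|w\|_\infty} , \\
\text{since}\quad
w^\top x^A - w^\top x
  &\le \|w\|_\infty \left\| x - x^A \right\|_1
  \quad \text{(H\"older's inequality)} ,
\end{align*}
% and the bound is approached by changing only a coordinate d with |w_d| = ||w||_inf.
%
% Numeric instance: w = (3, 1), b = 0, x^A = (1, 1) gives MAC = 4/3.
% With epsilon = 0.5, the point x = (-0.5, 1) is negative (w^T x = -0.5 < 0)
% and has cost 1.5 <= (1 + 0.5)(4/3) = 2, so x is a 0.5-IMAC.
```

Note that the infimum is not attained here because the negative class is open, which is one reason the goal is stated in terms of approximate rather than exact minimizers.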


Definition 13. A family of classifiers $\mathcal{F}$ is $\epsilon$-IMAC searchable under a
family of cost functions $\mathcal{A}$ if for every $f \in \mathcal{F}$ and
$A \in \mathcal{A}$, there is an algorithm that finds an instance in
$\epsilon$-IMAC$(f, A)$ using polynomially-many membership queries in $D$ and
$\log(1/\epsilon)$. We will refer to such an algorithm as efficient.


In  generalizing  the  results  of  Lowd  and  Meek  (2005b)  we  have  made  minor  alterations 
to  their  corresponding  definition  of  the  evasion  problem. 


Remark 14. Lowd and Meek (2005b) introduced the concept of adversarial classifier reverse
engineering (ACRE) learnability to quantify the difficulty of finding an $\epsilon$-IMAC
instance for particular families of classifiers $\mathcal{F}$ and adversarial costs
$\mathcal{A}$. The notion of ACRE $\epsilon$-learnable is similar to $\epsilon$-IMAC
searchable; however, there are some noteworthy differences. Our notion of efficiency does not
take into account the encoded size of $f$ for simplicity (in the linear case considered by
Lowd and Meek (2005b) this is simply $D$); similarly we do not make explicit dependence on
the encodings of the known positive and negative instances $x^A, x^-$ since these are
implicitly included via the dependence on $D$. ACRE learnability requires knowledge of a
third point $x^+ \in \mathcal{X}^+$. Here we take $x^+ = x^A$, making the attacker less covert
since it is typically significantly easier to infer the attacker’s intentions based on their
queries. Finally, we view the original goal of ACRE learnability as being one of evasion and
not of reverse engineering (we discuss the related goal of reverse engineering in
Section 3.2.2). As a consequence we have re-named the problem to highlight this fact.


3.2.1.1 Example Applications

In general, algorithms for the Evasion Problem have applications in attacking decision
rules in many domains.

Content-Based Email Spam Filtering. As discussed in the series of running examples
starting with Example 9, the notions of $L_1$ cost, access to the filter’s membership query
oracle for webmail services, and the desire to minimize cost over the negative class while
submitting few queries, are all appropriate for evading email spam filtering based on message
content.

Web Spam Filtering. In order to game search engine rankings, it is common-place
for parties to create spam web pages whose sole purpose is to contain content matching
target search queries and to link to the target webpage to be promoted in an effort to
increase PageRank-like authority scores (Gyongyi and Garcia-Molina, 2005). To combat
such malicious activities the search company can learn to detect such spam webpages (Drost
and Scheffer, 2005). This creates an arms race in which the adversaries are incentivized to
evade detection using methods such as those discussed here. Feedback is available to the
adversaries via the effects their link farms have on the search engine results, and blacklists
of spam pages.

Polymorphic Worm Detection. In order to make detection of malicious packets difficult
for defenders, attackers design polymorphic worms that mutate their binary while including
payload instructions that are required to exploit a specific vulnerability in a system.
Learning-based defenses have been designed which can learn (to some extent, see Newsome
et al., 2006; Venkataraman et al., 2008) to detect such polymorphic worms (Kim and Karp,
2004). An intelligent polymorphic worm could utilize evasive strategies to modify its code
in an attempt to evade detection. Cost corresponds to including as much of the desired
payload as possible; higher cost packets could come from including payloads that exploit
less severe vulnerabilities, so it is reasonable to model the worm as wanting to minimize
cost over the class of packets not filtered by a learned signature. Additionally, feedback may
be observed by monitoring acknowledgments, acknowledgment timings, and transmissions
from successfully infected systems.


Network Anomaly Detection. In the second case-study of this dissertation we investigate
an application of Principal Components Analysis (PCA) to detecting network-wide volume
anomalies. For example, Lakhina et al. (2004a) show that PCA can be used to detect DoS
attacks that cause high volume flows in top-tier networks. In our case-study, we consider
Causative attacks on PCA. However, the adversary may want to evade detection at test
time. Cost may measure the size of the flow initiated by the attacker, or the similarity of
the path taken compared to a desired path in the network. Finally, feedback to queries may
be observed by monitoring egress links to the destination PoP. Indeed our results apply for
the case of PCA in this setting, as the negative set is modeled as the (convex) set of
instances between a pair of parallel hyperplanes.


3.2.1.2 Multiplicative Optimality and Binary Search

The objective function introduced in Equation (3.2) is that of multiplicative optimality.
The results of this chapter are easily adapted for additive optimality in which we seek
instances with cost no more than $\eta > 0$ greater than the MAC. We will use the notation
$\epsilon$-IMAC$^{(*)}$ and $\eta$-IMAC$^{(+)}$ to refer to the set in Equation (3.2) and the
analogous set

$$\eta\text{-IMAC}^{(+)}(f, A) = \left\{ x \in \mathcal{X}_f^- \;\middle|\; A(x) \le \eta + \mathrm{MAC}(f, A) \right\} .$$

In either the multiplicative or additive case, we can organize the search for a near-optimal
instance by iterating over cost bounds on the positive and negative classes using a binary
search as follows.

If there is an instance $x \in \mathcal{X}^-$ with cost $C^-$ and if all instances with cost
no more than $C^+$ are in $\mathcal{X}^+$ then we can conclude that $C^-$ and $C^+$ bound the
MAC, i.e., $\mathrm{MAC}(f, A) \in [C^+, C^-]$. Moreover $x$ is $\epsilon$-multiplicatively
optimal, trivially, if $C_0^- / C_0^+ \le 1 + \epsilon$ and is $\eta$-additively optimal if
$C_0^- - C_0^+ \le \eta$. In the sequel, we will consider algorithms that use binary search to
iteratively reduce the gap between iterates of $C^-$ and $C^+$ to achieve additive or
multiplicative optimality. In particular, if a new query point with a given cost establishes
a new upper or lower bound on MAC, then binary search strategies can reduce the gap
between $C_t^-$ and $C_t^+$. Given sufficient iterations, optimality will be reached given
the following criteria.

Lemma 15. If an algorithm can provide bounds $0 < C^+ \le \mathrm{MAC}(f, A) \le C^-$, then
this algorithm has achieved

(1) $(C^- - C^+)$-additive optimality; and

(2) $\left( \frac{C^-}{C^+} - 1 \right)$-multiplicative optimality.
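The lemma follows in one line for each case. The short derivation below is a sketch that assumes the upper bound $C^-$ is witnessed by a known negative instance $x$ with $A(x) = C^-$ (as is the case when the bound comes from a query labeled negative).

```latex
% Assumption: x is a negative instance with A(x) = C^-, and 0 < C^+ <= MAC(f, A) <= C^-.
\begin{align*}
\text{(additive)} \quad
A(x) = C^- &= (C^- - C^+) + C^+ \;\le\; (C^- - C^+) + \mathrm{MAC}(f, A) , \\
\text{(multiplicative)} \quad
A(x) = C^- &= \frac{C^-}{C^+} \cdot C^+
  \;\le\; \left( 1 + \left( \frac{C^-}{C^+} - 1 \right) \right) \mathrm{MAC}(f, A) .
\end{align*}
```

The multiplicative case is where the requirement $C^+ > 0$ is used, since the ratio $C^- / C^+$ is otherwise undefined.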

The measure of performance to be optimized by search algorithms should correspond
to the gap between the bounds that determines the level of approximation to optimality,
as given by this lemma. To achieve additive optimality we define the additive gap at
iteration $t$ to be $G_t^{(+)} = C_t^- - C_t^+$ with $G_0^{(+)}$ corresponding to initial
bounds $C_0^-$ and $C_0^+$. In the additive setting binary search provides for an optimal
worst-case query complexity by using a proposal step of the arithmetic mean
$C_t = (C_t^- + C_t^+)/2$, stopping once $G_t^{(+)} \le \eta$. The search’s query complexity is

$$L_\eta^{(+)} = \left\lceil \log_2 \frac{G_0^{(+)}}{\eta} \right\rceil . \tag{3.3}$$

Multiplicative optimality can also be achieved via a binary search over the space of
exponents as follows. Rewriting the upper and lower bounds as $C^- = 2^a$ and $C^+ = 2^b$, the
multiplicative optimality condition becomes an additive condition
$a - b \le \log_2(1 + \epsilon)$. Binary search on the exponent achieves
$\epsilon$-multiplicative optimality with the fewest queries in the worst-case. The $t$th
multiplicative gap is $G_t^{(*)} = C_t^- / C_t^+$; the search uses as a proposal step the
geometric mean $C_t = \sqrt{C_t^- \cdot C_t^+}$ and stops once $G_t^{(*)} \le 1 + \epsilon$;
query complexity is

$$L_\epsilon^{(*)} = \left\lceil \log_2 \frac{\log_2 G_0^{(*)}}{\log_2 (1 + \epsilon)} \right\rceil . \tag{3.4}$$
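The two step counts can be checked numerically. The sketch below reads Equations (3.3) and (3.4) as $\lceil \log_2 (G_0^{(+)}/\eta) \rceil$ and $\lceil \log_2 (\log_2 G_0^{(*)} / \log_2(1+\epsilon)) \rceil$, and simulates the geometric-mean search against a one-dimensional stand-in for the oracle, in which a proposed cost counts as a lower bound exactly when it is below a hidden MAC. All function names and the numeric values are illustrative assumptions.

```python
import math
from typing import Tuple


def additive_steps(c_plus: float, c_minus: float, eta: float) -> int:
    """Binary-search steps needed to shrink the additive gap to at most eta."""
    gap = c_minus - c_plus
    return 0 if gap <= eta else math.ceil(math.log2(gap / eta))


def multiplicative_steps(c_plus: float, c_minus: float, eps: float) -> int:
    """Binary-search steps needed to shrink the ratio c_minus/c_plus to at most 1 + eps."""
    gap = c_minus / c_plus
    if gap <= 1.0 + eps:
        return 0
    return math.ceil(math.log2(math.log2(gap) / math.log2(1.0 + eps)))


def geometric_search(mac: float, c_plus: float, c_minus: float, eps: float) -> Tuple[float, float, int]:
    """Simulate the search: a proposed cost c is a lower bound iff c < mac."""
    steps = 0
    while c_minus / c_plus > 1.0 + eps:
        c = math.sqrt(c_plus * c_minus)  # geometric-mean proposal
        if c < mac:
            c_plus = c
        else:
            c_minus = c
        steps += 1
    return c_plus, c_minus, steps


predicted = multiplicative_steps(1.0, 1024.0, 0.1)
lo, hi, used = geometric_search(mac=37.0, c_plus=1.0, c_minus=1024.0, eps=0.1)
additive = additive_steps(0.0, 16.0, 0.5)
```

Because each geometric-mean step halves the gap in the exponent regardless of the oracle's answer, the simulated step count does not depend on where the hidden MAC lies between the initial bounds.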


The search methods for achieving additive and multiplicative optimality are intrinsically
related; however, there are two key differences which we now detail. First, multiplicative
optimality is well-defined only when $C_0^+ > 0$ whereas additive optimality is possible for
$C_0^+ = 0$. In this special case, $x^A$ is on the boundary of $\mathcal{X}^+$ and so there can
be no $\epsilon$-IMAC$^{(*)}$ for any $\epsilon > 0$. This pathological case is a minor
technical issue: we demonstrate in Section 3.3.1.4 an algorithm that efficiently establishes
a non-trivial lower bound if such a bound exists. Second, and more importantly, the additive
optimality criterion is not scale invariant whereas multiplicative optimality is. An
immediate consequence of this fact is that the units of the cost determine whether a
particular level of additive accuracy can be achieved whereas multiplicative costs are
unitless. While multiplicative optimality has the desirable property of scale invariance, it
does not have the shift invariance possessed by additive optimality. We view scale invariance
as being more important than shift invariance, since if the cost function is scale invariant
(as is the case for metric-based costs including the $L_p$ family) then optimality is
invariant to rescaling of the feature space.

For the remainder of this chapter, we focus on establishing $\epsilon$-multiplicative
optimality for an $\epsilon$-IMAC (except where explicitly noted) and define
$L_\epsilon = L_\epsilon^{(*)}$ and $G_t = G_t^{(*)}$. Finally, we relate query complexity in
terms of $L_\epsilon$, which will be convenient to reason about in the sequel, to complexity
in terms of $\epsilon$, which is the stated goal of $\epsilon$-IMAC searchability.


Remark 16. Notice that for sufficiently small $\epsilon$, binary search’s $L_\epsilon$ as
displayed in Equation (3.4) is $\Theta\left(\log \frac{1}{\epsilon}\right)$ since
$\log(1 + \epsilon) \approx \epsilon$. Thus demanding query complexity that is polynomial in
$\log \frac{1}{\epsilon}$ is equivalent to complexity that is polynomial in $L_\epsilon$.
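The approximation in Remark 16 can be spelled out as follows. This is a sketch that treats the initial gap $G_0$ as a fixed constant and reads Equation (3.4) as $L_\epsilon = \lceil \log_2 ( \log_2 G_0 / \log_2(1+\epsilon) ) \rceil$.

```latex
% Assumption: G_0 > 1 is fixed and epsilon -> 0.
\begin{align*}
\log_2 (1 + \epsilon) &= \frac{\epsilon}{\ln 2} + O(\epsilon^2) , \\
\log_2 \frac{\log_2 G_0}{\log_2 (1 + \epsilon)}
  &= \log_2 \log_2 G_0 + \log_2 \frac{1}{\epsilon} + \log_2 \ln 2 + O(\epsilon) , \\
\text{so}\quad L_\epsilon &= \log_2 \frac{1}{\epsilon} + O(1) .
\end{align*}
```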


3.2.2  The  Reverse  Engineering  Problem 


As stated in Remark 14, Lowd and Meek (2005b) term the evasion problem “adversarial
classifier reverse engineering (ACRE)” learnability. While the requirement of ACRE
learnability is actually to evade a classifier, their approach for linear classifiers is to
learn the decision boundary. It is this task of learning the classifier’s decision boundary
that we refer to here as the reverse engineering problem.* And while not identical problems,
this notion of reverse engineering is certainly related to the goal of active learning
(Schohn and Cohn, 2000).


Efficient query-based reverse engineering of an $f \in \mathcal{F}$ is clearly sufficient for
minimizing $A$ over the estimated negative space: once the decision boundary has been
determined, an offline optimization of the cost function (without submitting further queries)
yields an $\epsilon$-IMAC. However, reverse engineering is in general a query-expensive task
(since it relates to approximating volumes, which is hard even for convex bodies; Dyer and
Frieze, 1992), while finding an $\epsilon$-IMAC need not be: the requirements for finding an
$\epsilon$-IMAC differ significantly from the objectives of reverse engineering. To reverse
engineer, the attacker must approximate the decision boundary globally; to evade, the attacker
need only locally approximate the decision boundary in the neighborhood of a constrained
cost-optimizer.

In particular our algorithms construct queries to provably find an $\epsilon$-IMAC without
reverse engineering the classifier. A corollary of our results is that reverse engineering is
indeed significantly more complex than evasion for the rather general case of the positive or
negative class being convex.

* Our use of ‘reverse engineering’ corresponds to deriving insight into the underlying state
of the learner. Here we take that to mean the effective state of a classifier which is its
decision boundary. However it could also apply to attacks that aim to determine the
classifier’s model parameters that implicitly define the decision boundary, in the case of a
known learning algorithm with known parametrization.

Figure 3.1: Evading a classifier with a convex positive class to optimize an $L_1$ cost
involves finding the vertex of the ball that first pierces the negative class.

Figure 3.2: Evading a classifier with a convex negative class to optimize an $L_1$ cost is
harder than the convex positive case since the optimum may not be a vertex of the ball.


3.3 Evasion while Minimizing $L_1$-distance


This section develops algorithms for achieving $\epsilon$-IMAC searchability for the $L_1$
cost function and the family of convex-inducing classifiers that partition feature space
into a positive class and a negative class, one of which is convex. As discussed above, the
$L_1$ cost is natural for tasks such as email spam filtering, and was the focus of Lowd and
Meek (2005b) in their original work on evading linear classifiers. The convex-inducing
classifiers include the class of linear classifiers, neural networks with a single hidden
layer (convex polytopes), one-class classifiers that predict anomalies by thresholding the
log-likelihood of a log-concave (or uni-modal) density function, the one-class SVM with linear
kernel, and quadratic classifiers of the form $x^\top A x + b^\top x + c \ge 0$ for
semidefinite $A$. The convex-inducing classifiers also include classifiers whose support is
the intersection of a countable number of halfspaces, cones, or balls.

Restricting $\mathcal{F}$ to be the family of convex-inducing classifiers considerably
simplifies the general $\epsilon$-IMAC search problem, as depicted in Figures 3.1 and 3.2.
When the negative class $\mathcal{X}^-$ is convex (considered in Section 3.3.2), the problem
reduces to minimizing a convex function $A$ constrained to a convex set; if $\mathcal{X}^-$
were known to the adversary then evasion would reduce to a convex program. The key challenge
is that the adversary only has access to a membership query oracle for the classifier. When
the positive class $\mathcal{X}^+$ is convex (considered in Section 3.3.1), our task is to
minimize the convex function $A$ outside of a convex set; this is generally a hard problem,
however for certain cost functions including $L_1$ cost, it is easy to determine whether a
cost ball is completely contained within a convex set, leading to efficient approximation
algorithms.

Surprisingly there is an asymmetry depending on whether the positive or negative class
is convex. When the positive set is convex, determining whether an $L_1$ ball
$\mathbb{B}^C \subseteq \mathcal{X}^+$ only requires querying the vertices of the ball, of
which there are $2D$. When the negative class is convex, however, determining whether or not
$\mathbb{B}^C \cap \mathcal{X}^- = \emptyset$ is non-trivial since the intersection need not
occur at a vertex of the ball. We present a very efficient algorithm for optimizing the $L_1$
cost when $\mathcal{X}^+$ is convex and a polynomial randomized algorithm for optimizing any
convex cost when $\mathcal{X}^-$ is convex.

Our algorithms achieve multiplicative optimality via binary search. We use
$C_0^- = A(x^-)$ as an initial upper bound on the MAC and for technical reasons (cf.
Section 3.2.1.2) assume there is some $C_0^+ > 0$ that lower bounds the MAC (i.e., $x^A$ is in
the interior of $\mathcal{X}^+$).


3.3.1  Convex  Positive  Classes 

Solving the $\epsilon$-IMAC Search problem when $\mathcal{X}^+$ is convex is generally hard;
however, we will demonstrate in this section that for the (weighted) $L_1$ cost, binary search
algorithms yield very efficient solutions with almost-matching lower bounds on query
complexity. We now explain how we exploit the properties of the (weighted) $L_1$ ball together
with the convexity of $\mathcal{X}^+$ to efficiently determine whether
$\mathbb{B}^C \subseteq \mathcal{X}^+$ for any $C$. We also discuss practical aspects of
our algorithm and extensions to other $L_p$ cost functions.

The existence of an efficient query algorithm relies on three facts: (1)
$x^A \in \mathcal{X}^+$; (2) every weighted $L_1$ cost $C$-ball centered at $x^A$ intersects
with $\mathcal{X}^-$ only if at least one of the ball’s vertices is in $\mathcal{X}^-$; and
(3) $C$-balls of weighted $L_1$ costs only have $2 \cdot D$ vertices. The vertices of the
weighted $L_1$ ball differ from $x^A$ in exactly one feature $d$,

$$x^A \pm \frac{C}{c_d} \, \delta_d . \tag{3.5}$$

We now formalize the second fact, which follows immediately from the observation that the
$L_1$ ball is a (convex) polytope.

Lemma 17. For all $C > 0$, if there exists some $x \in \mathcal{X}^-$ of cost
$C = A_1^{(c)}(x)$, then there is a vertex of the $L_1$ $C$-cost ball in $\mathcal{X}^-$ that
trivially also achieves cost $C$.
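The vertex test suggested by Lemma 17 can be sketched in a few lines. The code below enumerates the $2 \cdot D$ vertices $x^A \pm (C/c_d)\,\delta_d$ of a weighted $L_1$ ball and queries each one; the helper names and the disc-shaped positive class are hypothetical choices made only for illustration.

```python
from typing import Callable, List, Optional, Sequence


def l1_ball_vertices(
    x_target: Sequence[float],
    cost: float,
    feature_costs: Optional[Sequence[float]] = None,
) -> List[List[float]]:
    """Return the 2*D vertices x^A +/- (C / c_d) * delta_d of the weighted L1 ball."""
    dim = len(x_target)
    if feature_costs is None:
        feature_costs = [1.0] * dim  # unit feature costs by default
    vertices = []
    for d in range(dim):
        step = cost / feature_costs[d]
        for sign in (1.0, -1.0):
            v = list(x_target)
            v[d] += sign * step
            vertices.append(v)
    return vertices


def ball_is_positive(
    f: Callable[[Sequence[float]], str],
    x_target: Sequence[float],
    cost: float,
    feature_costs: Optional[Sequence[float]] = None,
) -> bool:
    """True iff every vertex is labeled '+'. If the positive class is convex,
    this implies the whole cost ball is positive; all() stops at the first '-'."""
    return all(f(v) == "+" for v in l1_ball_vertices(x_target, cost, feature_costs))


def disc_classifier(x: Sequence[float]) -> str:
    # Hypothetical convex positive class: Euclidean disc of radius 2 at the origin.
    return "+" if x[0] ** 2 + x[1] ** 2 <= 4.0 else "-"


origin = [0.0, 0.0]
small_ball_ok = ball_is_positive(disc_classifier, origin, 1.0)  # cost 1 is a lower bound
large_ball_ok = ball_is_positive(disc_classifier, origin, 3.0)  # cost 3 is an upper bound
```

In this toy setting the outcome of the test at a given cost directly tells the adversary whether that cost is a lower or an upper bound on the MAC.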


As a consequence of this observation, if all vertices of a ball $\mathbb{B}^C$ are positive,
then all $x$ with $A(x) \le C$ are positive, thus establishing $C$ as a new lower bound on the
MAC. Conversely, if any vertex of $\mathbb{B}^C$ is negative, then $C$ becomes a new upper
bound on MAC. Thus, by simultaneously querying all $2 \cdot D$ equi-cost vertices of
$\mathbb{B}^C$, we establish $C$ as a new lower or upper bound. By performing a binary search
on $C$, using the geometric mean proposal step of $C_t = \sqrt{C_t^- \cdot C_t^+}$, we
iteratively halve the gap between our bounds on the logarithmic scale (each step takes the
square root of the multiplicative gap) until it is within a factor of $1 + \epsilon$, yielding
an $\epsilon$-IMAC of the form of Equation (3.5).

Algorithm 3 Multi-line Search

MLS($W$, $x^A$, $x^-$, $C_0^+$, $C_0^-$, $\epsilon$)
  $x^* \leftarrow x^-$
  $t \leftarrow 0$
  while $C_t^- / C_t^+ > 1 + \epsilon$ do
    $C_t \leftarrow \sqrt{C_t^+ \cdot C_t^-}$
    for all $e \in W$ do
      Query classifier: $f_e^t \leftarrow f(x^A + C_t e)$
      if $f_e^t = '-'$ then
        $x^* \leftarrow x^A + C_t e$
        For each $i \in W$: if $f_i^t = '+'$ then prune $i$ from $W$
        Lazy Querying: break for-loop
      end if
    end for
    $C_{t+1}^+ \leftarrow C_t^+$ and $C_{t+1}^- \leftarrow C_t^-$
    if $\forall e \in W$: $f_e^t = '+'$ then $C_{t+1}^+ \leftarrow C_t$
    else $C_{t+1}^- \leftarrow C_t$
    $t \leftarrow t + 1$
  end while
  return $x^*$

A general form of this MultiLineSearch procedure is presented as Algorithm 3, which
searches along all unit-cost search directions in the set $W$. The set $W$ represents search
directions that radiate from the ball’s origin at $x^A$, together span the ball, and have unit
cost. At each step, MultiLineSearch issues at most $|W|$ queries to construct a bounding shell
(i.e., the convex hull of these queries will either form an upper or lower bound on the MAC)
to determine whether $\mathbb{B}^C \subseteq \mathcal{X}^+$. Once a negative instance is found
at cost $C$, we cease further queries at cost $C$ since a single negative instance is
sufficient to establish an upper bound. We call this policy lazy querying. Further, when an
upper bound is established for a cost $C$ (a negative vertex is found), our algorithm prunes
all directions that were positive at cost $C$. This pruning is sound because by convexity any
such direction is positive for all costs less than $C$, and further $C$ is now an upper bound
on the MAC so all further queries will be at costs less than $C$. Finally, by performing a
binary search on the cost, MultiLineSearch finds an $\epsilon$-IMAC with no more than
$|W| \cdot L_\epsilon$ queries but at least $|W| + L_\epsilon$ queries. Thus, this algorithm
is $O(|W| \cdot L_\epsilon)$.

Algorithm 4 uses MultiLineSearch for (weighted) $L_1$ costs, setting $W$ to be the
vertices of the unit-cost $L_1$ ball which is centered at $x^A$. In this case, the search
issues at most $2 \cdot D$ queries to determine whether each
$\mathbb{B}^C \subseteq \mathcal{X}^+$ and Algorithm 4 has an exceptionally efficient query
complexity of $O(L_\epsilon \cdot D)$.
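The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the dissertation's implementation: the classifier is passed as a plain function returning '+' or '-', directions are tuples of unit cost, and the disc-shaped positive class in the usage lines is a hypothetical example.

```python
import math
from typing import Callable, List, Sequence, Tuple


def multi_line_search(
    f: Callable[[Sequence[float]], str],
    directions: Sequence[Tuple[float, ...]],
    x_target: Sequence[float],
    x_neg: Sequence[float],
    c_plus: float,
    c_minus: float,
    eps: float,
) -> Tuple[List[float], int]:
    """Return (negative instance found, number of queries issued)."""
    W = list(directions)
    x_star = list(x_neg)
    queries = 0
    while c_minus / c_plus > 1.0 + eps:
        c = math.sqrt(c_plus * c_minus)  # geometric-mean proposal step
        positive_here = []               # directions labeled '+' at cost c
        hit = False
        for e in W:
            point = [a + c * b for a, b in zip(x_target, e)]
            queries += 1
            if f(point) == "-":
                x_star, hit = point, True
                break                    # lazy querying
            positive_here.append(e)
        if hit:
            # Prune directions that were positive at the new upper bound.
            W = [e for e in W if e not in positive_here]
            c_minus = c
        else:
            c_plus = c                   # every remaining direction was positive
    return x_star, queries


def in_disc(x: Sequence[float]) -> str:
    # Hypothetical convex positive class: Euclidean disc of radius 2 (L1 MAC is 2).
    return "+" if x[0] ** 2 + x[1] ** 2 <= 4.0 else "-"


unit_dirs = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
x_star, n_queries = multi_line_search(
    in_disc, unit_dirs, [0.0, 0.0], [5.0, 0.0], c_plus=1.0, c_minus=5.0, eps=0.05
)
l1_cost = sum(abs(v) for v in x_star)
```

The round disc is close to the worst case for this breadth-first variant: no direction is ever pruned, so most iterations spend a query on every direction.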

3.3.1.1 K-step Multi-Line Search

We now develop a variant of the multi-line search algorithm that better exploits pruning
to further reduce the (already low) query complexity of Algorithm 3; we call this variant
K-step MultiLineSearch. The original MultiLineSearch algorithm makes $2 \cdot |W|$
simultaneous binary searches. Instead of this breadth-first search we could search
sequentially depth-first, and still obtain a best case of $\Omega(D + L_\epsilon)$ and worst
case of $O(L_\epsilon \cdot D)$ but for exactly the opposite convex bodies. For the parallel
search variant described above, the best case is an elongated ball and the worst case is a
rounded ball, while for a sequential binary search variant these cases are reversed. We
therefore propose an algorithm that mixes these strategies. We parametrize the mixture via a
parameter $K$ to be set later.

Algorithm 4 Convex $\mathcal{X}^+$ Set Search

ConvexSearch($W$, $x^A$, $x^-$, $\epsilon$, $C^+$)
  $C^- \leftarrow A(x^-)$
  return: MLS($W$, $x^A$, $x^-$, $C^+$, $C^-$, $\epsilon$)

Algorithm 5 K-Step Multi-line Search

KMLS($W$, $x^A$, $x^-$, $C_0^+$, $C_0^-$, $\epsilon$, $K$)
  $x^* \leftarrow x^-$
  $t \leftarrow 0$
  while $C_t^- / C_t^+ > 1 + \epsilon$ do
    Choose a direction $e \in W$
    $B^+ \leftarrow C_t^+$ and $B^- \leftarrow C_t^-$
    for $K$ steps do
      $B \leftarrow \sqrt{B^+ \cdot B^-}$
      Query classifier: $f_e \leftarrow f(x^A + B e)$
      if $f_e = '+'$ then $B^+ \leftarrow B$
      else $B^- \leftarrow B$ and $x^* \leftarrow x^A + B e$
    end for
    for all $i \in W \setminus \{e\}$ do
      Query classifier: $f_i^t \leftarrow f(x^A + (B^+) i)$
      if $f_i^t = '-'$ then
        $x^* \leftarrow x^A + (B^+) i$
        For each $k \in W$: if $f_k^t = '+'$ then prune $k$ from $W$
        Lazy Querying: break for-loop
      end if
    end for
    $C_{t+1}^- \leftarrow B^-$
    if $\forall i \in W$: $f_i^t = '+'$ then $C_{t+1}^+ \leftarrow B^+$
    else $C_{t+1}^- \leftarrow B^+$
    $t \leftarrow t + 1$
  end while
  return $x^*$

At each phase, the K-step MultiLineSearch algorithm chooses a single direction $e$ and queries it for $K$ steps to generate candidate bounds $B^-$ and $B^+$ on the MAC. Because this is a depth-first procedure, the algorithm makes substantial progress towards reducing the gap $G_t$ without querying other directions. In a breadth-first step, the algorithm then iteratively queries all remaining directions at the candidate lower bound $B^+$. Again we use lazy querying and stop as soon as a negative instance is found, since $B^+$ is then no longer a viable lower bound. In this case, although the candidate bound is invalidated, we can still prune all directions that were positive at $B^+$, including direction $e$. Thus, in every iteration, either the gap is decreased or at least one search direction is pruned. We now show that for $K = \lceil \sqrt{L_\epsilon} \rceil$, the algorithm achieves a delicate balance between breadth-first and depth-first approaches to attain a better worst-case complexity than following either approach alone.
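The phase structure described above can be sketched in Python. This is a minimal illustration rather than the dissertation's implementation: the function name, the `is_negative` oracle, the tuple representation of points, and the rule of always picking the first unpruned direction are assumptions made for the example.

```python
import math

def k_step_multiline_search(is_negative, x_a, directions, c_plus, c_minus, eps, k):
    """Hypothetical sketch of K-step MultiLineSearch.

    is_negative(x) is the membership oracle (True for class '-').
    c_plus and c_minus are valid initial lower and upper bounds on the MAC.
    Returns (witness, c_plus, c_minus, queries); witness is None if the
    initial upper bound was never improved.
    """
    def at(cost, e):
        return tuple(a + cost * d for a, d in zip(x_a, e))

    live = list(directions)           # the set W of unpruned directions
    witness, queries = None, 0
    while c_minus / c_plus > 1 + eps:
        e = live[0]
        b_plus, b_minus = c_plus, c_minus
        for _ in range(k):            # depth-first: K bisection steps along e
            b = math.sqrt(b_plus * b_minus)
            queries += 1
            if is_negative(at(b, e)):
                b_minus, witness = b, at(b, e)
            else:
                b_plus = b
        positive, refuted = [e], False
        for d in live[1:]:            # breadth-first: test B+ on the other directions
            queries += 1
            if is_negative(at(b_plus, d)):
                witness, refuted = at(b_plus, d), True
                break                 # lazy querying
            positive.append(d)
        if refuted:                   # B+ is not a lower bound, so it becomes the upper bound
            live = [d for d in live if d not in positive]
            c_minus = b_plus
        else:                         # both candidate bounds are verified
            c_plus, c_minus = b_plus, b_minus
    return witness, c_plus, c_minus, queries
```

Every pass through the loop either shrinks the gap or prunes at least the direction `e`, which is the invariant the analysis below relies on.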

Theorem 18. K-step MultiLineSearch run with $K = \lceil \sqrt{L_\epsilon} \rceil$ will find an $\epsilon$-IMAC by submitting at most $O\left(L_\epsilon + \sqrt{L_\epsilon}\,|\mathbb{W}|\right)$ queries.

Proof.  We  consider  a  defender  that  is  choosing  the  classifier  (and  hence  the  oracle’s  responses 
to  the  attacker’s  queries)  adaptively  to  force  a  large  number  of  queries  on  the  attacker.  Our 
goal  is  to  bound  the  worst-case  number  of  queries. 

During the $K$ steps of binary search, regardless of how the defender responds, the candidate gap along $e$ will shrink by an exponent of $2^{-K}$; i.e.,

$$\frac{B^-}{B^+} = \left(\frac{C_t^-}{C_t^+}\right)^{2^{-K}} . \qquad (3.6)$$


The  primary  decision  for  the  defender  occurs  when  the  adversary  begins  querying  directions 
other  than  e.  At  iteration  t,  the  defender  has  two  options: 


Case 1 ($t \in C_1$): Respond with '+' for all remaining directions. Here the bounds $B^+$ and $B^-$ are verified and thus the gap is reduced by an exponent of $2^{-K}$.

Case 2 ($t \in C_2$): Choose at least one direction to respond with '−'. Here the defender can make the gap decrease by a negligible amount but also must choose some number $E_t \ge 1$ of eliminated directions.

By conservatively assuming the gap only decreases in case 1, i.e., if $t \in C_1$ then $G_t = G_{t-1}^{2^{-K}}$ and otherwise $G_t = G_{t-1}$, the total number of queries is bounded regardless of the order in which the cases are applied. In particular,

$$|C_1| \le \left\lceil \frac{L_\epsilon}{K} \right\rceil \qquad (3.7)$$


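since we need a total of $L_\epsilon$ binary search steps and each case 1 iteration is responsible for $K$ of them.

Every case 1 iteration makes exactly $K + |\mathbb{W}_t| - 1$ queries. The size of $\mathbb{W}_t$ (depending on when lazy querying activates) is controlled by the defender, but we can bound it by $|\mathbb{W}|$. This and Equation (3.7) bound the number of queries used in case 1 by

$$Q_1 = \sum_{t \in C_1} \left(K + |\mathbb{W}_t| - 1\right) \le L_\epsilon + K + \left\lceil \frac{L_\epsilon}{K} \right\rceil \cdot \left(|\mathbb{W}| - 1\right) .$$

Each case 2 iteration uses exactly $K + E_t$ queries and eliminates $E_t \ge 1$ directions. Since a case 2 iteration eliminates at least 1 direction, $|C_2| \le |\mathbb{W}| - 1$ and moreover $\sum_{t \in C_2} E_t \le |\mathbb{W}| - 1$, since each direction can only be eliminated once. Thus the number of queries due to case 2 is bounded by

$$Q_2 = \sum_{t \in C_2} \left(K + E_t\right) \le \left(|\mathbb{W}| - 1\right)\left(K + 1\right) ,$$

and so the total number of queries used by K-step MultiLineSearch is

$$Q = Q_1 + Q_2 \le L_\epsilon + \left(\left\lceil \frac{L_\epsilon}{K} \right\rceil + K + 1\right) |\mathbb{W}| ,$$

which is minimized by $K = \lceil \sqrt{L_\epsilon} \rceil$. Substituting this for $K$ and using $L_\epsilon / \lceil \sqrt{L_\epsilon} \rceil \le \sqrt{L_\epsilon}$ we have

$$Q \le L_\epsilon + \left(2 \lceil \sqrt{L_\epsilon} \rceil + 1\right) |\mathbb{W}| ,$$

proving that $Q = O\left(L_\epsilon + \sqrt{L_\epsilon}\,|\mathbb{W}|\right)$. □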
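To see how the choice of $K$ trades off the two search styles, the total bound from the proof can be evaluated numerically. The snippet below is only an illustration: the function name and the sample values of $L_\epsilon$ and $|\mathbb{W}|$ are assumptions chosen for the example, not figures from the text.

```python
import math

def query_bound(l_eps, num_directions, k):
    """Evaluate the bound L + (ceil(L / K) + K + 1) * |W| from the proof of Theorem 18."""
    return l_eps + (math.ceil(l_eps / k) + k + 1) * num_directions

l_eps, num_directions = 16, 200          # illustrative values only
k_balanced = math.ceil(math.sqrt(l_eps))
for k in (1, k_balanced, l_eps):
    print(k, query_bound(l_eps, num_directions, k))
```

With these sample values, the mostly breadth-first setting ($K = 1$) and the mostly depth-first setting ($K = L_\epsilon$) give the same bound, while the balanced choice gives roughly half of it.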

As a consequence of this result, finding an $\epsilon$-IMAC with K-step MultiLineSearch for a (weighted) $L_1$ cost requires only $O\left(L_\epsilon + \sqrt{L_\epsilon}\,D\right)$ queries in the worst case. In particular, the search algorithm that calls MLS can incorporate K-step MultiLineSearch directly by replacing its function call to MLS with a call to KLMS and setting $K = \lceil \sqrt{L_\epsilon} \rceil$.


3.3.1.2  Evading Linear Classifiers


Lowd and Meek (2005b) originally developed a method for reverse engineering linear classifiers for a (weighted) $L_1$ cost. First their method isolates a sequence of points from $x^-$ to $x^A$ that cross the classifier's boundary and then it estimates the hyperplane's parameters using $D$ line searches. Their algorithm has complexity $O(D \cdot L_\epsilon)$. As a consequence of our new ability to efficiently minimize the $L_1$ objective for any convex $\mathcal{X}^+$, we gain an alternative method for evading linear classifiers. Because linear classifiers are a special case of convex-inducing classifiers, our K-step MultiLineSearch algorithm improves slightly on the reverse-engineering technique's query complexity and applies to a much larger class of classifiers.


3.3.1.3  Lower Bounds


The following result establishes a lower bound on the number of queries required by any algorithm to find an $\epsilon$-IMAC when $\mathcal{X}^+$ is convex for a (weighted) $L_1$ cost. We present the result for the case of multiplicative optimality, incorporating a lower bound $r > 0$ on the MAC for technical reasons; however, the same essential argument yields the same lower bound of $\max\{D, L_\eta^{(+)}\}$ for algorithms achieving $\eta$-additive optimality.

Theorem 19. Consider any $D \in \mathbb{N}$, $x^A \in \mathcal{X} = \mathbb{R}^D$, $x^- \in \mathcal{X}$, $0 < r < R = A(x^-)$ and $\epsilon \in \left(0, \frac{R}{r} - 1\right)$. For all query algorithms submitting $N < \max\{D, L_\epsilon\}$ queries, there exist two classifiers inducing convex positive classes in $\mathcal{X}$ such that

1.  Both positive classes properly contain $\mathbb{B}^r$;

2.  Neither positive class contains $x^-$;

3.  The classifiers return the same responses on the algorithm's $N$ queries; and

4.  The classifiers have no common $\epsilon$-IMAC.

That is, in the worst case all query algorithms for convex positive classes must submit at least $\max\{D, L_\epsilon\}$ membership queries in order to be multiplicative $\epsilon$-optimal.

Proof. Suppose some query-based algorithm submits $N$ membership queries $x^1, \ldots, x^N$ to the classifier. For the algorithm to be $\epsilon$-optimal, these queries must constrain all consistent positive convex sets to have a common point among their $\epsilon$-IMAC sets.

First consider the case that $N \ge L_\epsilon$. By assumption, then, $N < D$. Suppose classifier $f$ responds as

$$f(x) = \begin{cases} +1, & \text{if } A(x) < R \\ -1, & \text{otherwise.} \end{cases}$$

For this classifier, $\mathcal{X}_f^+$ is convex, $\mathbb{B}^r \subset \mathcal{X}_f^+$, and $x^- \notin \mathcal{X}_f^+$. Moreover, since $\mathcal{X}_f^+$ is the open ball of cost $R$, $\mathrm{MAC}(f, A) = R$.

Consider an alternative classifier $g$ that responds identically to $f$ for $x^1, \ldots, x^N$ but has a different convex positive set $\mathcal{X}_g^+$. Without loss of generality, suppose that the first $M \le N$ query responses are positive and the remaining are negative. Let $\mathcal{G} = \mathrm{conv}(x^1, \ldots, x^M)$ be the convex hull of the $M$ positive queries. Now let $\mathcal{X}_g^+$ be the convex hull of the union of $\mathcal{G}$ and the $r$-ball around $x^A$; i.e., $\mathcal{X}_g^+ = \mathrm{conv}(\mathcal{G} \cup \mathbb{B}^r)$. Since $\mathcal{G}$ contains all positive queries and $r < R$, the convex set $\mathcal{X}_g^+$ is consistent with the responses from $f$, $\mathbb{B}^r \subset \mathcal{X}_g^+$, and $x^- \notin \mathcal{X}_g^+$. Moreover, since $M \le N < D$, $\mathcal{G}$ is contained in a proper subspace of $\mathcal{X}$ whereas $\mathbb{B}^r$ is not. Hence, $\mathrm{MAC}(g, A) = r$. Since the accuracy $\epsilon$ is less than $\frac{R}{r} - 1$, any $\epsilon$-IMAC of $g$ must have cost less than $R$ whereas any $\epsilon$-IMAC of $f$ must have cost greater than or equal to $R$. Thus we have constructed two convex-inducing classifiers $f$ and $g$ with consistent query responses but with no common $\epsilon$-IMAC.

Now consider the case that $N < L_\epsilon$. First, recall our definitions: $C_0^- = R$ is the initial upper bound on the MAC, $C_0^+ = r$ is the initial lower bound on the MAC, and $G_t = C_t^- / C_t^+$ is the gap between the upper bound and lower bound at iteration $t$. Here the defender $f$ responds with

$$f(x^t) = \begin{cases} +1, & \text{if } A(x^t) \le \sqrt{C_{t-1}^- \cdot C_{t-1}^+} \\ -1, & \text{otherwise.} \end{cases}$$

This strategy ensures that at each iteration $G_t \ge \sqrt{G_{t-1}}$ and since the algorithm cannot terminate until $G_N \le 1 + \epsilon$, we have $N \ge L_\epsilon$ from Equation (3.4). As in the $N \ge L_\epsilon$ case we have constructed two convex-inducing classifiers with consistent query responses but with no common $\epsilon$-IMAC. The first classifier's positive set is the smallest cost-ball enclosing all positive queries, while the second classifier's positive set is the largest cost-ball enclosing all positive queries but no negatives. The MAC values of these sets differ by more than a factor of $(1 + \epsilon)$ if $N < L_\epsilon$, so they have no common $\epsilon$-IMAC. □
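The defender's strategy in the second case can be simulated directly. The sketch below is illustrative only: the function names are hypothetical, and the attacker modelled in `queries_forced` is assumed to play its best move against this defender, namely querying exactly at the geometric-mean threshold.

```python
import math

def defender_bounds(query_cost, c_plus, c_minus):
    """Apply the geometric-mean defender to one query; return the new (c_plus, c_minus)."""
    threshold = math.sqrt(c_plus * c_minus)
    if query_cost <= threshold:                   # defender answers '+'
        return max(c_plus, query_cost), c_minus
    return c_plus, min(c_minus, query_cost)       # defender answers '-'

def queries_forced(c_plus, c_minus, eps):
    """Count queries until the gap is at most 1 + eps when the attacker
    always queries at the threshold (an assumption of this sketch)."""
    n = 0
    while c_minus / c_plus > 1 + eps:
        guess = math.sqrt(c_plus * c_minus)
        c_plus, c_minus = defender_bounds(guess, c_plus, c_minus)
        n += 1
    return n
```

Whatever cost the attacker queries, the new gap is at least the square root of the old one, so the gap can at best be square-rooted once per query.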


Remark 20. For the additive and multiplicative cases we restrict $\eta$ and $\epsilon$ to the intervals $\left(0, A(x^-)\right)$ and $\left(0, \frac{A(x^-)}{r} - 1\right)$ respectively. In fact, outside of these intervals the query strategies are trivial. For either $\eta = 0$ or $\epsilon = 0$ no approximation algorithm will terminate. Similarly, for $\eta \ge A(x^-)$ or $\epsilon \ge \frac{A(x^-)}{r} - 1$, the instance $x^-$ is a near-optimal instance itself so no queries are required.


Theorem 19 and the analogous additive result show that $\eta$-additive and $\epsilon$-multiplicative optimality require $\Omega\left(L_\eta^{(+)} + D\right)$ and $\Omega\left(L_\epsilon^{(*)} + D\right)$ queries respectively. Thus, we see that our K-step MultiLineSearch algorithm has almost optimal query complexity with $O\left(L_\epsilon + \sqrt{L_\epsilon}\,D\right)$ queries for weighted $L_1$ costs.


3.3.1.4  Generalizations

We now consider two relaxations that require minor modifications to the MultiLineSearch and K-step MultiLineSearch algorithms, primarily as simple preprocessing steps.


No Initial Lower Bound. To find an $\epsilon$-IMAC our basic algorithms search between initial bounds $C_0^+$ and $C_0^-$, but in general $C_0^+$ may not be known to a real-world adversary. Algorithm 6 (SpiralSearch) can efficiently establish a lower bound on the MAC if one exists. The basic idea of the algorithm is to perform a guess-then-halve search on the exponent, starting from the upper bound. The algorithm also eliminates any direction that exceeds the current upper bound.

At the $t$-th iteration of SpiralSearch a direction is selected and queried at the current lower bound of $2^{-2^t} C_0^-$. If the query's response is positive, that direction is added to the set $\mathbb{V}$ of directions consistent with the lower bound. Otherwise, all directions in $\mathbb{V}$ are discarded and the lower bound is lowered with an exponentially decreasing exponent. Thus, given that some lower bound $C_0^+ > 0$ does exist, one will be found relatively quickly in $O(L_\epsilon + D)$ queries, for $\mathbb{W}$ equal to the $2 \cdot D$ directions of the coordinate axes and $\epsilon = 1$ (corresponding to no dependence on $\epsilon$, which is not a parameter of the search).

^The inverse of the better-known guess-then-double algorithms, which are used, for example, for dynamically allocating arrays.

Algorithm 6 Spiral Search
 1:  spiral($\mathbb{W}$, $x^A$, $x^-$, $C_0^-$)
 2:  $t \leftarrow 0$ and $\mathbb{V} \leftarrow \emptyset$
 3:  repeat
 4:      Choose a direction $e \in \mathbb{W}$
 5:      Query classifier: $f_e \leftarrow f\left(x^A + 2^{-2^t} C_0^- \, e\right)$
 6:      if $f_e = $ '−' then
 7:          $t \leftarrow t + 1$
 8:          $\mathbb{V} \leftarrow \emptyset$
 9:      else
10:          $\mathbb{W} \leftarrow \mathbb{W} \setminus \{e\}$
11:          $\mathbb{V} \leftarrow \mathbb{V} \cup \{e\}$
12:      end if
13:  until $\mathbb{W} = \emptyset$
14:  $B^+ \leftarrow 2^{-2^t} C_0^-$
15:  return $(\mathbb{V}, B^+, C_0^-)$


Proposition 21. For any classifier with convex positive class $\mathcal{X}^+$, $x^A \in \mathcal{X}^+$, $x^- \notin \mathcal{X}^+$, upper bound $C_0^- > 0$, greatest lower bound $0 < C_0^+ < C_0^-$, and set of covering directions $\mathbb{W}$, Algorithm 6 will find a valid lower bound $B^+ \le C_0^+$ in at most $|\mathbb{W}| + \log_2 \log_2\left(C_0^- / C_0^+\right)$ queries.

Proof. Every query submitted by the algorithm results in either the pruning of a direction from $\mathbb{W}$ or the halving of the lower bound in exponent space. There are at most $|\mathbb{W}|$ events of the first kind. To analyze the maximum number of steps of the second kind, consider that we stop when $2^{-2^t} C_0^- \le C_0^+$. This is equivalent to $t \ge \log_2 \log_2\left(C_0^- / C_0^+\right)$. Combining these query count bounds yields the result. □
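A direct transcription of Algorithm 6 into Python may help make the bookkeeping concrete. This is a sketch under stated assumptions: the function name, the `is_negative` oracle, the tuple representation of points, and the `max_t` safety cap (motivated by the non-termination caveat when no lower bound exists) are all introduced here for illustration.

```python
def spiral_search(is_negative, x_a, directions, c_minus, max_t=8):
    """Hypothetical sketch of SpiralSearch: find a lower bound on the MAC.

    Returns (kept_directions, lower_bound, c_minus, queries).
    """
    live, kept = list(directions), []     # the sets W and V
    t, queries = 0, 0
    while live:
        if t > max_t:                     # safety cap, not part of Algorithm 6
            raise RuntimeError("no positive lower bound found")
        e = live[0]
        bound = c_minus * 2.0 ** (-(2 ** t))
        queries += 1
        x = tuple(a + bound * d for a, d in zip(x_a, e))
        if is_negative(x):
            t += 1                        # shrink the bound in exponent space
            kept = []                     # earlier positives are pruned for good
        else:
            live.pop(0)
            kept.append(e)
    return kept, c_minus * 2.0 ** (-(2 ** t)), c_minus, queries
```

Directions that were positive at a bound later refuted by a negative response are dropped permanently, which is why the returned set can be smaller than the original direction set.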

Thus this algorithm can be used as a precursor to any of the previous searches and can be adapted for additive optimality by halving the lower bound instead of the exponent. Furthermore, the search directions pruned by SpiralSearch are also invalid for the subsequent MultiLineSearch, so the set $\mathbb{V}$ returned by SpiralSearch can be used as the set $\mathbb{W}$ for the subsequent search, amortizing some of the multiline search's effort.

^In the pathological case of no existing lower bound, this algorithm would not terminate. In practice, the search should be terminated after sufficiently many iterations.

Algorithm 7 Intersect Search
 1:  IntersectSearch($\mathcal{P}^0$, $\mathcal{Q} = \{x^j \in \mathcal{P}^0\}$, $C$)
 2:  for all $s = 1 \ldots T$ do
 3:      Generate $2N$ samples $\{x^j\}_{j=1}^{2N}$
 4:          Choose $x^j$ from $\mathcal{Q}$
 5:          $x^j \leftarrow$ HitRun($\mathcal{P}^{s-1}$, $\mathcal{Q}$, $x^j$)
 6:      If $\exists x^j$, $A(x^j) \le C$, then terminate the for-loop
 7:      Put samples into 2 sets of size $N$
 8:          $\mathcal{R} \leftarrow \{x^j\}_{j=1}^{N}$ and $\mathcal{S} \leftarrow \{x^j\}_{j=N+1}^{2N}$
 9:      $z^s \leftarrow \frac{1}{N} \sum_{x^j \in \mathcal{R}} x^j$
10:      Compute $\mathcal{H}_{z^s}$ using Equation (3.9)
11:      $\mathcal{P}^s \leftarrow \mathcal{P}^{s-1} \cap \mathcal{H}_{z^s}$
12:      Keep samples in $\mathcal{P}^s$
13:          $\mathcal{Q} \leftarrow \{x \in \mathcal{S} \cap \mathcal{P}^s\}$
14:  end for
15:  Return: the discovered witness $[x^j, \mathcal{P}^s, \mathcal{Q}]$; or 'No Intersect'


No Initial Negative Example. Our algorithms can also naturally be adapted to the case when the adversary has no negative example $x^-$. This is accomplished by an inverse process of the previous no-lower-bound generalization, through a guess-then-double type process of querying $L_1$ balls of doubly exponentially increasing radii until a negative instance is found. During the $t$-th iteration, we probe along each search direction at a cost $2^{2^t}$ until a negative example is found. Once we have obtained a negative example (having probed for $T$ iterations), we must have $2^{2^{T-1}} < \mathrm{MAC}(f, A) \le 2^{2^T}$. Thus we can now perform our multi-line search with $C_0^+ = 2^{2^{T-1}}$ and $C_0^- = 2^{2^T}$. This precursor step requires at most $2 \cdot D \left\lceil \log_2 \log_2 \mathrm{MAC}(f, A) \right\rceil$ queries to prepare the MultiLineSearch algorithm.
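The guess-then-double precursor for the case of no initial negative example can be sketched as follows. The function name, the `is_negative` oracle and the `max_t` cap are hypothetical, and the sketch returns `None` for the lower bound when a negative instance is found in the very first round, since no smaller radius has been verified at that point.

```python
def find_negative_instance(is_negative, x_a, directions, max_t=8):
    """Hypothetical sketch: probe L1 balls of radius 2^(2^t) until a negative is found.

    Returns (negative_point, verified_lower_bound, upper_bound, queries).
    """
    verified, queries = None, 0
    for t in range(max_t + 1):
        radius = 2.0 ** (2 ** t)
        for e in directions:
            queries += 1
            x = tuple(a + radius * d for a, d in zip(x_a, e))
            if is_negative(x):
                return x, verified, radius, queries
        verified = radius     # every direction was positive at this radius
    return None, verified, None, queries
```

The two returned bounds can then seed a multiline search as the initial lower and upper bounds on the MAC.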


3.3.2  Convex  Negative  Classes 


In this section we consider minimizing a weighted $L_1$ cost $A$ (cf. Equation (3.1)) when the feasible set $\mathcal{X}^-$ is convex. Although any convex function can be efficiently minimized within a known convex set (e.g., using the Ellipsoid Method and Interior Point methods; Boyd and Vandenberghe 2004), in the context of the evasion problem the convex set is accessible only via membership queries. We use a randomized polynomial algorithm due to Bertsimas and Vempala (2004) to minimize the cost function $A$ given an initial point $x^- \in \mathcal{X}^-$. For any fixed cost $C^t$ we use their algorithm to determine (with high probability) whether $\mathcal{X}^-$ intersects with $\mathbb{B}^{C^t}$; that is, whether or not $C^t$ is a new lower or upper bound on the MAC. Again by applying a binary search, we find an $\epsilon$-IMAC with a high degree of confidence in no more than $L_\epsilon$ repetitions. We now focus only on weighted $L_1$ costs and return to more general cases in Section 3.4.2.


The final Algorithm 9 runs with polynomial query complexity $O^*\left(D^5 L_\epsilon\right)$. The following sections provide a detailed sketch of the steps followed by the algorithm.


3.3.2.1  Procedure to Determine Whether Convex Sets Intersect


We begin by outlining the query-based procedure of Bertsimas and Vempala (2004) for determining whether two convex sets (e.g., $\mathcal{X}^-$ and $\mathbb{B}^{C^t}$) intersect. Their IntersectSearch procedure (presented here as Algorithm 7) is a randomized Ellipsoid method for determining whether there is an intersection between two bounded convex sets $\mathcal{P}$ and $\mathcal{B}$ with the following properties: $\mathcal{P}$ is only accessible through membership queries and $\mathcal{B}$ provides a separating hyperplane for any point outside it. The reader should view $\mathcal{P}$ and $\mathcal{B}$ as something like $\mathcal{X}^-$ and $\mathbb{B}^{C^t}$, which certainly satisfy these properties. Their technique uses efficient query-based approaches to uniformly sample from the convex set $\mathcal{P}$ to obtain sufficiently many samples such that cutting $\mathcal{P}$ through the centroid of these samples with a separating hyperplane from $\mathcal{B}$ will significantly reduce the volume of $\mathcal{P}$ with high probability. The technique thus constructs a sequence of progressively smaller feasible sets $\mathcal{P}^s \subset \mathcal{P}^{s-1}$ until either it finds a point in $\mathcal{P} \cap \mathcal{B}$ or it is highly unlikely that the sets intersect. We now detail how we use their technique to efficiently evade filtering.

^$O^*(\cdot)$ denotes the standard complexity notation $O(\cdot)$ up to logarithmic factors.

Algorithm 8 Hit-and-Run Sampling
 1:  HitRun($\mathcal{P}$, $\{y^j\}$, $x^0$)
 2:  for all $i = 1 \ldots K$ do
 3:      Pick a random direction:
 4:          $\nu_j \sim \mathrm{N}(0, 1)$
 5:          $v \leftarrow \sum_j \nu_j \cdot y^j$
 6:      Find $\omega_1$ and $\omega_2$ s.t.
 7:          $x^{i-1} - \omega_1 v \notin \mathcal{P}$ and
 8:          $x^{i-1} + \omega_2 v \notin \mathcal{P}$
 9:      repeat
10:          $\omega \sim \mathrm{Unif}(-\omega_1, \omega_2)$
11:          $x^i \leftarrow x^{i-1} + \omega v$
12:          if $\omega < 0$ then $\omega_1 \leftarrow -\omega$
13:          else $\omega_2 \leftarrow \omega$
14:      until $x^i \in \mathcal{P}$
15:  end for
16:  Return: $x^K$

So far we have reduced our problem to finding the intersection between $\mathcal{X}^-$ and $\mathbb{B}^{C^t}$. An important initialization step, however, is required because the algorithm is designed to test the intersection of bounded convex sets, while $\mathcal{X}^-$ may be unbounded. Let $R = 2A(x^-)$ and let $\mathbb{B}^{2R}(x^-)$ be the $L_1$-ball of radius $2R$ centered at $x^-$. Since we are minimizing a cost, we can consider the set $\mathcal{P}^0 = \mathcal{X}^- \cap \mathbb{B}^{2R}(x^-)$, a bounded subset of $\mathcal{X}^-$. Because $C^t < A(x^-)$, the ball $\mathbb{B}^{2R}(x^-)$ contains $\mathbb{B}^{C^t}$, and thus $\mathcal{P}^0$ also contains the intersection $\mathcal{X}^- \cap \mathbb{B}^{C^t}$ if it exists. A final initial remark is that we also assume that there is some $r > 0$ such that there is an $L_1$-ball of radius $r$ contained in the convex set $\mathcal{X}^-$.

Before detailing the IntersectSearch procedure (cf. Algorithm 7), we summarize the Hit-and-Run random walk technique introduced by Smith (1996) (cf. Algorithm 8), which is the backbone of IntersectSearch and is used to sample uniformly from a bounded convex body. Given an instance $x^j \in \mathcal{P}^{s-1}$, Hit-and-Run selects a random direction $v$ through $x^j$ (we return to the selection of $v$ in Section 3.3.2.2). Since $\mathcal{P}^{s-1}$ is a bounded convex set, the set $\Omega = \{\omega \mid x^j + \omega v \in \mathcal{P}^{s-1}\}$ is a bounded interval of all feasible points along direction $v$ through $x^j$. Sampling $\omega$ uniformly from $\Omega$ (using rejection sampling) yields the next step of the random walk: $x^j + \omega v$. Under the appropriate conditions (addressed in Section 3.3.2.2), the Hit-and-Run random walk generates a sample uniformly from the convex body after $O^*(D^3)$ steps (Lovász and Vempala, 2004).
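A small Python sketch of the Hit-and-Run walk with a membership oracle is given below. It makes two simplifying assumptions that are not from the text: directions are drawn from an isotropic Gaussian rather than from a maintained sample set, and the interval endpoints $\omega_1$, $\omega_2$ are found by repeated doubling. The function and parameter names are hypothetical.

```python
import random

def hit_and_run(in_body, x0, steps, rng):
    """Hypothetical sketch: random walk in a bounded convex body given by in_body(x)."""
    x = list(x0)
    for _ in range(steps):
        v = [rng.gauss(0.0, 1.0) for _ in x]         # random direction through x

        def move(w):
            return [xi + w * vi for xi, vi in zip(x, v)]

        w1 = w2 = 1.0
        while in_body(move(-w1)):                    # grow until x - w1*v leaves the body
            w1 *= 2.0
        while in_body(move(w2)):                     # grow until x + w2*v leaves the body
            w2 *= 2.0
        while True:                                  # rejection sampling on [-w1, w2]
            w = rng.uniform(-w1, w2)
            candidate = move(w)
            if in_body(candidate):
                x = candidate
                break
            if w < 0:
                w1 = -w                              # shrink the interval toward x
            else:
                w2 = w
    return x
```

For example, `hit_and_run(lambda p: sum(abs(c) for c in p) <= 1.0, [0.0, 0.0, 0.0], 50, random.Random(0))` walks inside the unit $L_1$ ball in three dimensions; every call to `in_body` corresponds to one membership query.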


Using Hit-and-Run we obtain $2N$ samples $\{x^j\}$ from $\mathcal{P}^{s-1} \subseteq \mathcal{X}^-$ and determine if any satisfy $A(x^j) \le C^t$. If so, $x^j$ is in the intersection of $\mathcal{X}^-$ and $\mathbb{B}^{C^t}$ and the procedure is complete. Otherwise, our goal is to significantly reduce the size of $\mathcal{P}^{s-1}$ without excluding any of $\mathbb{B}^{C^t}$, so that our sampling concentrates toward the intersection (if it exists); for this we need a separating hyperplane for $\mathbb{B}^{C^t}$. For any point $y \notin \mathbb{B}^{C^t}$, the (sub)gradient of the weighted $L_1$ cost at $y$ is given by $h^y$ with components

$$h^y_f = c_f \, \mathrm{sign}\left(y_f - x^A_f\right) . \qquad (3.8)$$

This is the normal to a separating hyperplane for $y$ and $\mathbb{B}^{C^t}$.

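To achieve efficiency, we choose a point $z \in \mathcal{P}^{s-1}$ so that cutting $\mathcal{P}^{s-1}$ through $z$ with the hyperplane $h^z$ eliminates a significant fraction of $\mathcal{P}^{s-1}$. To do so, $z$ must be centrally located within $\mathcal{P}^{s-1}$; we use the empirical centroid of half of our samples in $\mathcal{R}$: $z = \frac{1}{N} \sum_{x \in \mathcal{R}} x$ (the other half will be used in Section 3.3.2.2). We cut $\mathcal{P}^{s-1}$ with the hyperplane $h^z$ through $z$; that is, $\mathcal{P}^s = \mathcal{P}^{s-1} \cap \mathcal{H}_z$ where $\mathcal{H}_z$ is the halfspace

$$\mathcal{H}_z = \left\{ x \mid x^\top h^z \le z^\top h^z \right\} . \qquad (3.9)$$

As shown by Bertsimas and Vempala (2004), this cut achieves $\mathrm{vol}(\mathcal{P}^s) \le \frac{2}{3} \mathrm{vol}(\mathcal{P}^{s-1})$ with high probability so long as $N = O^*(D)$ and $\mathcal{P}^{s-1}$ is sufficiently round (see Section 3.3.2.2).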
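The centroid cut can be illustrated with a few lines of Python. The helper names and the list-based vectors are assumptions of this sketch; the cost weights `c` stand for the per-feature weights of the weighted $L_1$ cost.

```python
def l1_subgradient(y, x_a, c):
    """Subgradient of the weighted L1 cost at y, as in Equation (3.8)."""
    def sign(u):
        return (u > 0) - (u < 0)
    return [cf * sign(yf - af) for yf, af, cf in zip(y, x_a, c)]

def centroid(samples):
    """Empirical centroid of a non-empty list of equal-length points."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def in_halfspace(x, z, h):
    """Membership test for the halfspace of Equation (3.9): x.h <= z.h."""
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    return dot(x, h) <= dot(z, h)
```

Any point whose cost is at most the cost of the cut point `z` passes `in_halfspace`, so the cut never removes part of a cost ball that `z` lies outside of, while points on the far side of the hyperplane are discarded.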


Observing that the ratio of the volumes between the initial circumscribing and inscribing balls of the feasible set is $(R/r)^D$, the algorithm can terminate after $T = O\left(D \log \frac{R}{r}\right)$ unsuccessful iterations with a high probability that the intersection is empty.

Because every iteration in Algorithm 7 requires $N = O^*(D)$ samples, each of which needs $K = O^*(D^3)$ random walk steps, and there are $O^*(D)$ iterations, the total number of membership queries required by Algorithm 7 is $O^*(D^5)$.


3.3.2.2  Efficient Sampling from Convex Bodies with Membership Oracles

Until this point, we assumed the Hit-and-Run random walk efficiently produces uniformly random samples from any bounded convex body $\mathcal{P}$ accessible through membership queries. However, if the body is severely elongated, randomly selected directions will rarely align with the long axis of the body and our random walk will take small steps (relative to the long axis) and mix slowly. Essentially, we require that the convex body be well-rounded. More formally, for the walk to mix effectively, we need the convex body $\mathcal{P}$ to be near-isotropic; i.e., for any unit vector $v$, $E_{x \sim \mathcal{P}}\left[\left(v^\top \left(x - E_{x \sim \mathcal{P}}[x]\right)\right)^2\right]$ is bounded between 1/2 and 3/2 of $\mathrm{vol}(\mathcal{P})$.

If $\mathcal{P}$ is not near-isotropic, we must rescale $\mathcal{X}$ with an appropriate affine transformation $T$ so the resulting body $\mathcal{P}'$ is near-isotropic. With sufficiently many samples from $\mathcal{P}$ we can estimate $T$ as the empirical covariance matrix. However, instead of rescaling $\mathcal{X}$ explicitly, we do so implicitly using a technique described by Bertsimas and Vempala (2004). To do so, we maintain a set $\mathcal{Q}$ of sufficiently many uniform samples from the body $\mathcal{P}^s$ and in the Hit-and-Run algorithm we sample directions based on this set. Intuitively, because the samples in $\mathcal{Q}$ are distributed uniformly in $\mathcal{P}^s$, the directions we sample based on the points in $\mathcal{Q}$ implicitly reflect the covariance structure of $\mathcal{P}^s$. This is equivalent to sampling the direction from a normal distribution with the covariance of $\mathcal{P}$.

We must ensure that $\mathcal{Q}$ has sufficiently many samples from $\mathcal{P}^s$ after each cut $\mathcal{P}^s \leftarrow \mathcal{P}^{s-1} \cap \mathcal{H}_{z^s}$. Recall that we initially sampled $2N$ points from $\mathcal{P}^{s-1}$ using our Hit-and-Run procedure; half of these were used to estimate the centroid $z^s$ for the cut and the other half, $\mathcal{S}$, are used to repopulate $\mathcal{Q}$ after the cut. Because $\mathcal{S}$ contains independent uniform samples from $\mathcal{P}^{s-1}$, those in $\mathcal{P}^s$ after the cut constitute independent uniform samples from $\mathcal{P}^s$ (along the same lines as rejection sampling). By choosing $N$ sufficiently large, our cut will be sufficiently deep and we will have sufficiently many points to resample $\mathcal{P}^s$ after the cut.

Algorithm 9 Convex $\mathcal{X}^-$ Set Search
 1:  SetSearch($\mathcal{P}$, $\mathcal{Q} = \{x^j \in \mathcal{P}\}$, $C_0^-$, $C_0^+$, $\epsilon$)
 2:  $x^* \leftarrow x^-$ and $t \leftarrow 0$
 3:  while $C_t^- / C_t^+ > 1 + \epsilon$ do
 4:      $C_t \leftarrow \sqrt{C_t^- \cdot C_t^+}$
 5:      $[x^*, \mathcal{P}', \mathcal{Q}'] \leftarrow$ IntersectSearch($\mathcal{P}$, $\mathcal{Q}$, $C_t$)
 6:      if intersection found then
 7:          $C_{t+1}^- \leftarrow A(x^*)$ and $C_{t+1}^+ \leftarrow C_t^+$
 8:          $\mathcal{P} \leftarrow \mathcal{P}'$ and $\mathcal{Q} \leftarrow \mathcal{Q}'$
 9:      else
10:          $C_{t+1}^- \leftarrow C_t^-$ and $C_{t+1}^+ \leftarrow C_t$
11:      end if
12:      $t \leftarrow t + 1$
13:  end while
14:  Return: $x^*$

Finally, before we start this resampling procedure, we need an initial set $\mathcal{Q}$ of uniformly distributed points from $\mathcal{P}^0$ but, in our problem, we only have a single point $x^- \in \mathcal{X}^-$. Fortunately, there is an iterative procedure for putting our convex set $\mathcal{P}^0$ into a near-isotropic position: the RoundingBody algorithm described by Lovász and Vempala (2003) uses $O^*(D^4)$ membership queries to transform the convex body into near-isotropic position. We use this as a preprocessing step for Algorithms 7 and 9; that is, given $\mathcal{X}^-$ and $x^- \in \mathcal{X}^-$ we make $\mathcal{P}^0 = \mathcal{X}^- \cap \mathbb{B}^{2R}(x^-)$ and then use the RoundingBody algorithm to produce $\mathcal{Q} = \{x^j \in \mathcal{P}^0\}$. These sets are then the inputs to our search algorithms.


3.3.2.3  Optimization over $L_1$ Balls

We now revisit the outermost optimization loop for searching for the minimum feasible cost and suggest improvements: implemented naively as described, the algorithm repeats work that can be shared across iterations.

First, since $x^A$, $x^-$ and $\mathcal{X}^-$ are the same for every iteration of the optimization procedure, we only need to run the RoundingBody procedure once as a preprocessing step. The set of samples $\{x^j \in \mathcal{P}^0\}$ it produces is sufficient to initialize IntersectSearch at each stage of the binary search over $C^t$.

Second, the separating hyperplane $h^y$ given by Equation (3.8) does not depend on the target cost $C^t$ but only on $x^A$, the common center of all the $L_1$ balls. In fact, the separating hyperplane at point $y$ is valid for all weighted $L_1$-balls of cost $C < A(y)$. Further, if $C < C^t$, we have $\mathbb{B}^{C} \subset \mathbb{B}^{C^t}$. Thus, the final state from a successful call to IntersectSearch for the $C^t$-cost ball can serve as the starting state for any subsequent call to IntersectSearch for all $C < C^t$.

These improvements are reflected in our final procedure SetSearch in Algorithm 9.
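The outer loop of SetSearch, including the reuse of state between calls, can be sketched as below. This is a hedged illustration: `intersect_search` is a hypothetical callable standing in for IntersectSearch that returns either `None` or a pair `(witness, new_state)`, and the opaque `state` argument plays the role of the body and sample set carried between calls.

```python
import math

def set_search(intersect_search, cost, x_minus, c_plus, eps, state=None):
    """Hypothetical sketch of the binary search over target costs.

    cost(x) is the adversary's cost; x_minus is a known negative instance;
    c_plus is a valid lower bound on the MAC.
    """
    witness, c_minus = x_minus, cost(x_minus)
    while c_minus / c_plus > 1 + eps:
        target = math.sqrt(c_plus * c_minus)      # geometric bisection
        found = intersect_search(state, target)
        if found is not None:
            witness, state = found                # reuse the final state next time
            c_minus = cost(witness)
        else:
            c_plus = target                       # target is a new lower bound
    return witness
```

In this sketch a failed call leaves `state` untouched, so the next call restarts from the last successful state rather than from the initial rounded body.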


3.4  Evasion while Minimizing $L_p$-distances


While the focus of Section 3.3, like that of Lowd and Meek (2005b), is attacks that almost-minimize $L_1$ cost, an adversary may instead value some other cost function. In this section we consider attacks for evading convex-inducing classifiers that almost-minimize $L_p$ costs for $p \ne 1$. For certain values of $p$ we show that, depending on whether the positive or negative class is convex, either the evasion problem requires an exponential number of queries or it can be efficiently solved by our existing algorithms. For other $L_p$ costs, finding efficient evasion procedures remains an open question.


3.4.1  Convex  Positive  Classes 

We now consider the application of the MultiLineSearch and K-step MultiLineSearch algorithms for evading convex-inducing classifiers under $L_p$ costs when $p \ne 1$.

3.4.1.1  Multiline Search for $p \in (0, 1)$

In the case where $p < 1$, a simple reduction holds. Since an $L_1$-ball of radius $R$ bounds radius-$R$ $L_p$-balls for all $p < 1$, we can simply use our existing algorithms with the $2 \cdot D$ vertices of the hyperoctahedron (the $L_1$ ball) as search directions, to find an $\epsilon$-IMAC while submitting the same number of queries as before.

3.4.1.2  Multiline Search for $p \in [1, \infty]$

If the level of approximation $\epsilon$ is permitted to increase with dimension $D$, then positive results are possible using our existing algorithms. The quality of our result, measured by the level of near-optimality we can guarantee via a range on $\epsilon$, depends on how well the $L_p$ ball is approximated by the $L_1$ ball.

Theorem 22. Using the $2 \cdot D$ axis-parallel search directions, any of our multiline search algorithms efficiently find an $\epsilon$-IMAC for

(i)  $L_p$ cost, for $p \in [1, \infty)$, for all $\epsilon > D^{(p-1)/p} - 1$; and

(ii)  $L_\infty$ cost for all $\epsilon > D - 1$.

Figure 3.3: Relating upper and lower bounds on $L_1$ cost found by multiline search, to bounds on $L_2$ cost.

Proof. First observe that for $p \in (1, \infty)$, the largest $L_p$ ball that is contained inside a unit $L_1$ ball has radius $D^{(1-p)/p}$, since the $L_p$ and $L_1$ balls must meet at the point $D^{-1} \cdot \mathbf{1}$ by symmetry and the $L_p$-norm of this point is $\alpha_p = D^{(1-p)/p}$. Similarly the largest $L_\infty$ ball to achieve this feat has radius $\alpha_\infty = D^{-1}$.

Now consider running multiline search with the set of $2 \cdot D$ axis-parallel directions as the search direction set $\mathbb{W}$, until the $L_1$ cost bounds yield the stopping criterion $C_0^- / C_0^+ \le 1 + \epsilon'$. Notice, as depicted in Figure 3.3 for the $p = 2$ case, that while the upper bound on $L_1$ cost of $C_0^-$ establishes the identical bound on $L_p$ cost, the same is not true for the lower bound. The lower bound on $L_p$ cost achieved by the search is simply $\alpha_p C_0^+$ where $\alpha_p$ is defined above for $p \in (1, \infty]$.

Thus we can guarantee that the search has found an $\epsilon$-IMAC provided that $C_0^- / (\alpha_p C_0^+) \le 1 + \epsilon$, which holds if $\alpha_p^{-1}(1 + \epsilon') \le 1 + \epsilon$. If $\epsilon'$ is taken so that $\alpha_p \ge (1 + \epsilon')^{-1}$ then we have an $\epsilon$-IMAC if $(1 + \epsilon')^2 \le 1 + \epsilon$. This condition is satisfied by taking $\epsilon' = \sqrt{1 + \epsilon} - 1$.

Thus provided that $\alpha_p \ge 1/(1 + \epsilon')$, running a multiline search with query complexity in terms of the following quantity is sufficient to find an $\epsilon$-IMAC:

$$L_{\epsilon'} = O\left(\log \frac{\log\left(C_0^- / C_0^+\right)}{\log(1 + \epsilon')}\right) = O(L_\epsilon) .$$

And so the procedure still has the same polynomial query complexity in $D$ and $L_\epsilon$, but with worse constants. Finally we have $\alpha_p \ge 1/(1 + \epsilon')$ for $p \in [1, \infty)$ iff $\epsilon' \ge D^{(p-1)/p} - 1$, and for $p = \infty$ iff $\epsilon' \ge D - 1$. □
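The constants in the proof are easy to check numerically. The helper names below are hypothetical, and the dimension used in the usage note is an arbitrary illustrative choice.

```python
import math

def alpha(p, dim):
    """Radius of the largest L_p ball inside the unit L_1 ball in dimension dim."""
    return 1.0 / dim if math.isinf(p) else dim ** ((1.0 - p) / p)

def lp_norm(x, p):
    """L_p norm of x, with p = math.inf giving the max norm."""
    if math.isinf(p):
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def min_epsilon(p, dim):
    """Threshold 1/alpha_p - 1 that the approximation level must reach in Theorem 22."""
    return 1.0 / alpha(p, dim) - 1.0
```

For instance, in dimension 4 the point with all coordinates equal to 1/4 has $L_2$ norm `lp_norm([0.25] * 4, 2)`, which agrees with `alpha(2, 4)`, and `min_epsilon(math.inf, 4)` reproduces the $D - 1$ threshold of part (ii).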

These results incur the same query complexities as for the $L_1$ case since we are using the same set of search directions. Better results, in the sense of achieving lower values of $\epsilon$, are possible by using additional search directions (the intuition being that more directions better approximate $L_p$ balls). The cost of expanding the search set $\mathbb{W}$ is that the query complexity increases.

If $\epsilon$ is constant with respect to the dimension $D$ then evasion for $p \in [2, \infty]$ requires an exponential number of queries in $D$: no efficient algorithm exists.

Theorem 23. For any $D > 0$, any initial bounds $0 < C_0^+ < C_0^-$ on the MAC, and $0 < \epsilon < 1$, all algorithms must submit at least $\alpha_\epsilon^D$ membership queries (where $\alpha_\epsilon > 1$) in the worst case to be $\epsilon$-multiplicatively optimal on all classifiers with convex positive classes, for $L_\infty$ costs.

Proof. We proceed by constructing two convex positive sets consistent with the responses of the oracle, but with MACs that are sufficiently different that no algorithm could simultaneously find an $\epsilon$-IMAC for both without submitting many queries. The first convex positive set $\mathcal{P}_1$ is simply the $L_\infty$ ball of cost one.

Consider the $M$ queries made by the algorithm relative to $x^A$ that have cost at most 1. Each such query must fall in one of the $2^D$ octants of the $L_\infty$ ball about $x^A$. Let us regard an octant as 'covered' if there is one or more queries in it. Define our second convex positive set $\mathcal{P}_2$ as the convex hull of all the octants covered by the $M$ queries. $\mathcal{P}_1$ responds positively for these queries (and negatively for all others), while $\mathcal{P}_2$ certainly contains the $M$ queries and so also responds positively to these. It also responds negatively to any other queries. Thus both candidate positive sets are convex, and respond consistently with any sequence of queries. The MAC for $\mathcal{P}_1$ is trivially one. We now consider when the MAC for $\mathcal{P}_2$ is much smaller, providing a separation between $\epsilon$-IMACs for the two candidate sets.

^As noted, the actual number of queries increases, but only by constant factors. For example, the basic
multiline search's number of queries increases by an additional D queries: accounting for constants including
in the logarithms' bases, L^i = 1 + L,,.

Each of the covered octants defining V_2 can be identified with a point in the D-cube
{0, 1}^D. Let C ⊆ {0, 1}^D be this subset of points. Suppose C is a K-covering of the D-cube
but not a (K − 1)-covering, for integer K wrt the Hamming distance. Then there must be at
least one element v of the D-cube that is at least K Hamming distance from every member
of C; i.e., v would not be covered by any (K − 1)-ball centered around a point in C. The
element v corresponds to an unoccupied octant, and all occupied octants' representatives
differ by at least K coordinates from v. WLOG assume that all octants that differ by K
coordinates are in fact in the covering. These octants border a face of the convex hull V_2
which separates v from V_2. For simplicity of explanation, we assume WLOG that v is the
vector of all ones. The corners of this face then correspond to all vectors that have K zeros
and D − K ones. Since there must be a total of $(D-K)\binom{D}{K}$ ones distributed uniformly
among the D coordinates over all the corners of the face, the midpoint of this face is then
given by


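\[
\frac{1}{\binom{D}{K}} \left( \frac{D-K}{D}\binom{D}{K},\; \frac{D-K}{D}\binom{D}{K},\; \ldots,\; \frac{D-K}{D}\binom{D}{K} \right)
= \frac{D-K}{D} \, (1, 1, \ldots, 1)
\]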


which has an L_∞ cost of (D − K)/D. By the symmetry of V_2, this midpoint minimizes cost
over V_2. That is, the MAC under V_2 is (D − K)/D. By contrast the MAC under V_1 is
simply one.
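The midpoint computation above can be checked by brute force for small D and K. The following is a minimal sketch, not part of the original proof: the helper names `face_midpoint` and `l_inf_cost` are hypothetical, and the face corners are taken (as in the text, with v the all-ones vector) to be all 0/1 vectors with exactly K zeros.

```python
from fractions import Fraction
from itertools import combinations


def face_midpoint(D, K):
    """Average all 0/1 vectors of length D that have exactly K zeros."""
    corners = []
    for zero_positions in combinations(range(D), K):
        zeros = set(zero_positions)
        corners.append([0 if i in zeros else 1 for i in range(D)])
    n = len(corners)  # equals binomial(D, K)
    # Exact rational arithmetic, so the result can be compared with (D - K) / D.
    return [Fraction(sum(c[i] for c in corners), n) for i in range(D)]


def l_inf_cost(x):
    """L-infinity cost of a point, measured from the origin."""
    return max(abs(v) for v in x)


if __name__ == "__main__":
    mid = face_midpoint(5, 2)
    print(mid, l_inf_cost(mid))
```

Every coordinate of the midpoint equals (D − K)/D, so its L_∞ cost is also (D − K)/D, in agreement with the derivation.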

Given the consistency of the two convex positive sets' responses, this implies that any
algorithm that submits insufficient queries for a (K − 1)-covering cannot achieve a
multiplicative approximation better than 1 + ε ≥ D/(D − K). Solving for K this yields

\[
K \leq \frac{\epsilon}{1+\epsilon} \, D , \tag{3.10}
\]

which relates a desirable ε to the radius of the covering; i.e., for better approximations, a
larger (lower radius) covering is required.

We next consider the number of queries required to achieve a covering of a given radius
K. Consider such a cover. Each element of the covering covers exactly $\sum_{k=0}^{K}\binom{D}{k}$
elements of the D-cube. Since the cube has 2^D vertices, the covering's cardinality must be
at least $2^D / \sum_{k=0}^{K}\binom{D}{k}$ to cover the entire D-cube. To further bound this
quantity it suffices to use the bound

\[
\sum_{k=0}^{\lfloor \delta D \rfloor} \binom{D}{k} \leq 2^{H(\delta) D} , \tag{3.11}
\]

where 0 < δ ≤ 1/2 and H(δ) = −δ log_2 δ − (1 − δ) log_2(1 − δ) is the entropy. Let K = D/2 so
that having no (K − 1)-cover implies no approximation better than ε ≥ 1. By Equations (3.10)
and (3.11), any algorithm must submit enough queries so that at least
$2^{D(1 - H(\epsilon/(1+\epsilon)))}$ vertices of the hypercube are covered to achieve 0 < ε < 1.
By construction of the 'covered' octants, an algorithm would need to submit at least this many
queries to achieve (1 + ε)-multiplicative optimality. However, since H(δ) < 1 for δ < 1/2, we
have that the query complexity is

\[
M = \Omega\left( \alpha_\epsilon^D \right) \quad \text{for} \quad \alpha_\epsilon = 2^{1 - H\left(\frac{\epsilon}{1+\epsilon}\right)} > 1 .
\]

□
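To get a feel for the quantities in this proof, the following sketch evaluates the base $2^{1 - H(\epsilon/(1+\epsilon))}$ of the exponential bound and compares the volume bound on the covering size with its entropy-based relaxation. The function names are hypothetical, and the Hamming-ball radius convention (radius K, inclusive) is an assumption made for illustration.

```python
from math import comb, log2


def binary_entropy(delta):
    """Binary entropy H(delta) in bits, with H(0) = H(1) = 0."""
    if delta in (0.0, 1.0):
        return 0.0
    return -delta * log2(delta) - (1 - delta) * log2(1 - delta)


def alpha(eps):
    """Base of the exponential lower bound, 2^(1 - H(eps / (1 + eps)))."""
    return 2 ** (1 - binary_entropy(eps / (1 + eps)))


def min_cover_size(D, K):
    """Volume bound: 2^D divided by the size of a Hamming ball of radius K."""
    ball = sum(comb(D, k) for k in range(K + 1))
    return 2 ** D / ball


def entropy_bound(D, K):
    """Looser closed form 2^(D (1 - H(K / D))), intended for K <= D / 2."""
    return 2 ** (D * (1 - binary_entropy(K / D)))


if __name__ == "__main__":
    for eps in (0.1, 0.25, 0.5, 0.9):
        print(f"eps={eps}: alpha={alpha(eps):.4f}")
    print(min_cover_size(20, 5), entropy_bound(20, 5))
```

The base tends to 1 as ε approaches 1, which is why the theorem is restricted to 0 < ε < 1: at ε = 1 the bound is no longer exponential.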


We  can  apply  a  similar  argument  for  other  Lp  costs  with  p  >  1,  to  yield  similar  negative 
query  complexity  results. 
Corollary 24. For any D > 0, any initial bounds $0 < C_0^- < C_0^+$ on the MAC, and
$0 < \epsilon < 2^{(p-1)/p} - 1$, all algorithms must submit at least $\alpha_{p,\epsilon}^D$
membership queries (where $\alpha_{p,\epsilon} > 1$) in the worst case to be ε-multiplicatively
optimal on all classifiers with convex positive classes, for L_p costs with p > 1.

Proof. The same argument as used in Theorem 23 applies here, with minor modifications.
Our first convex positive class V_1 is now the unit-cost L_p ball. Again we can consider octants
being covered by queries falling within V_1, however octants are not restricted to this L_p ball
too, and again V_2 is taken as the convex hull of the covered octants.

The MAC under V_1 is again one; however, the derivation for V_2's MAC changes slightly.
Now the corners defining the face separating v from V_2 are no longer the vectors of K zeros
and D − K ones; instead the ones are replaced by some β such that the resulting vector
has L_p cost one: the appropriate value of β is $(D-K)^{-1/p}$. As a result, the midpoint of the
face is now $\frac{(D-K)^{(p-1)/p}}{D} \mathbf{1}$, which has an L_p cost of
$\left((D-K)/D\right)^{(p-1)/p}$, representing the MAC under V_2.

Thus without forming a (K − 1)-covering, the best approximation achievable is
$1 + \epsilon \geq \left(D/(D-K)\right)^{(p-1)/p}$. Solving for K this yields

\[
K \leq \frac{(1+\epsilon)^{p/(p-1)} - 1}{(1+\epsilon)^{p/(p-1)}} \, D .
\]

Now setting K = D/2 as before yields that for ε satisfying this relation, submitting
fewer than $M = \Omega\left(\alpha_{p,\epsilon}^D\right)$ queries for
$\alpha_{p,\epsilon} = 2^{1 - H(\beta_{p,\epsilon})}$ cannot achieve better than a (1 + ε)-approximation,
where $\beta_{p,\epsilon} = (1+\epsilon)^{p/(1-p)}$. As before we have that $\alpha_{p,\epsilon} > 1$
since the argument to the entropy is at least 1/2. Finally, lower bounding this argument to the
entropy by 1/2 implies a bound of $2^{(p-1)/p} - 1$ on ε, under which the result holds, where above
the bound was simply one. □


Figure 3.4 plots the range $\epsilon \in \left(0, 2^{(p-1)/p} - 1\right)$ as a function of p. As p → ∞ this
upper bound quickly approaches 1, which coincides with the upper bound derived for the L_∞ cost.
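The upper end of this range is easy to tabulate. The following sketch (the function name `eps_upper` is hypothetical) evaluates $2^{(p-1)/p} - 1$ at the values of p marked on the axis of Figure 3.4.

```python
def eps_upper(p):
    """Upper end of the range of eps covered by Corollary 24, for p >= 1."""
    return 2 ** ((p - 1) / p) - 1


if __name__ == "__main__":
    for p in (1, 2, 5, 10, 20):
        print(f"p={p}: eps < {eps_upper(p):.4f}")
```

The bound is zero at p = 1, where the corollary says nothing, increases with p, and stays below one for every finite p.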

While our positive results for the convex positive class p > 1 case provide approximations
ε that must increase fairly quickly with D, it is most important to understand the query
complexity of good approximations for small ε. Theorem 23 and the corollary provide such
results.

Notably for the L_2 cost, the upper bound is √2 − 1 ≈ 0.414; for this case it is possible
to derive a similar exponential query complexity result that holds for all ε by appealing to
a result of Wyner (1965) on covering numbers for the surface of the hypersphere.


Theorem  25.  For  any  D  >  1,  any  initial  bounds  0  <  Cfi  <  Cq  on  the  MAC,  and  0  <  e  < 
^  —  1,  all  algorithms  must  submit  at  least  ae  membership  queries  (where  > 

1)  in  the  worst  case  to  be  e-multiplicatively  optimal  wrt  the  L2  cost,  on  all  classifiers  with 
convex  positive  classes. 


We provide a proof sketch here. Once again the argument is to construct two candidate
positive convex classes that produce identical responses but have very different MACs. One
class corresponds to the hypersphere itself, while the other is simply the convex hull of the
queries that fall within the hypersphere. In the best case (for the attacker) these queries are
on the sphere's boundary. It is easy to see that the achieved approximation corresponds to the
greatest height of the spherical caps defined by the difference between the hypersphere and
the convex hull. Minimizing the spherical cap height (maximizing the accuracy of
approximation) corresponds to covering the sphere's surface; the size of covers grows
exponentially with D.

[Figure 3.4 plot, titled "Approximations Needing Exponential Queries"; horizontal axis: p (log scale).]

Figure 3.4: The values of ε (shown in gray), for each p > 1, for which minimizing L_p cost
requires exponential numbers of queries according to Corollary 24. The cost parameter p is
shown on a log scale. The upper bound quickly approaches unity as p → ∞.

We  can  compare  this  proof  technique  with  the  technique  used  for  general  Lp  costs.  The 
looseness  of  the  general  technique  comes  from  covering  only  (hyper)octants — several  query 
points  on  the  cost  ball’s  surface,  within  a  single  octant,  do  not  contribute  to  the  evasion 
algorithm’s  approximation  beyond  a  single  query’s  contribution.  In  the  improved  result, 
the  spherical  caps  need  not  be  aligned  according  to  the  coordinate  axes;  we  better  lower 
bound  the  number  of  queries  required  to  achieve  approximations.  It  is  thus  conceivable  that 
a  similar  argument  could  yield  negative  results  that  hold  for  all  constant  (wrt  D)  levels  of 
approximation  e,  for  all  p  >  1. 

3.4.2  Convex  Negative  Classes 

The algorithm for convex negative classes applies to all costs that correspond to a weighted
L_p distance centered at x^A, for p ≥ 1, since such cost functions are convex. In these cases,
the analogous gradient cuts to those proposed in Section 3.3.2 are valid and once again
applicable to any ball of smaller cost. To adapt the algorithm, one need only change cost
function A and the separating hyperplane used for the halfspace cut in Equation (3.8).

Moreover SetSearch is applicable for any convex cost function A so long as we can
compute the separating hyperplanes of any sublevel set^ S of A for any point y ∉ S. For
general convex costs, it still holds that the sublevel set of cost C (the C-cost ball at x^A)
is contained in the sublevel set of cost D for all D > C. As a consequence, the separating
hyperplanes for the sublevel set at D are also separating hyperplanes for the set at cost C.
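For a differentiable convex cost, one standard way to obtain such a separating hyperplane is from the gradient at y: by convexity, every x with cost at most that of y satisfies g · (x − y) ≤ 0. The sketch below illustrates this for a weighted L_p cost with p > 1. The particular weighting $(\sum_i c_i |x_i - x^A_i|^p)^{1/p}$ and all function names are assumptions made for the example, not the dissertation's definitions.

```python
import random


def weighted_lp_cost(x, center, weights, p):
    """Assumed cost: (sum_i c_i |x_i - center_i|^p)^(1/p), with c_i > 0."""
    total = sum(c * abs(xi - ai) ** p for xi, ai, c in zip(x, center, weights))
    return total ** (1 / p)


def cost_gradient(y, center, weights, p):
    """Gradient of the weighted L_p cost at y (requires p > 1 and y != center)."""
    cost = weighted_lp_cost(y, center, weights, p)
    grad = []
    for yi, ai, c in zip(y, center, weights):
        d = yi - ai
        sign = (d > 0) - (d < 0)
        grad.append(c * sign * abs(d) ** (p - 1) / cost ** (p - 1))
    return grad


def on_negative_side(x, y, grad):
    """True if x lies in the halfspace {z : grad . (z - y) <= 0}."""
    return sum(g * (xi - yi) for g, xi, yi in zip(grad, x, y)) <= 1e-9


if __name__ == "__main__":
    rng = random.Random(0)
    center, weights, p = [0.0, 0.0, 0.0], [1.0, 2.0, 0.5], 3
    y = [1.0, -0.5, 2.0]
    g = cost_gradient(y, center, weights, p)
    cy = weighted_lp_cost(y, center, weights, p)
    for _ in range(1000):
        x = [rng.uniform(-3, 3) for _ in range(3)]
        if weighted_lp_cost(x, center, weights, p) <= cy:
            assert on_negative_side(x, y, g)
    print("every sampled point of the sublevel set lies on one side of the cut")
```

Because sublevel sets are nested, the same cut also separates y from every ball of smaller cost, which is the property the paragraph above relies on.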


3.5  Summary 


This chapter explores the evasion problem as defined by ε-IMAC searchability: using a
polynomial number of membership queries to a fixed but unknown classifier, find a negative
instance that almost-minimizes cost. This work generalizes the results of Lowd and Meek
(2005b), who consider how best to launch Exploratory attacks on learning systems, i.e.,
how to submit malicious test instances. The stated goal of this chapter of evasion (False
Negatives) corresponds to Integrity attacks within the taxonomy of Barreno et al. (2006)
as described in Section 1.2.2. However the methods are equally applicable to Availability
attacks (that cause False Positives).

The analysis of our algorithms shows that convex-inducing classifiers are ε-IMAC
searchable for L_p costs, for p ≤ 1 in the convex positive class case and for p ≥ 1 in the convex
negative class case. We observe a phase transition when p crosses one: for convex positive
classes evasion to minimize L_p cost with p > 1 requires exponential queries. When
the positive class is convex we give efficient techniques that achieve a query complexity
of $O\left(\log(1/\epsilon) + \sqrt{\log(1/\epsilon)}\, D\right)$, outperforming previous reverse-engineering
approaches for the special case of linear classifiers. We provide a lower bound of
$\Omega\left(\log(1/\epsilon) + D\right)$ on the query complexity for any algorithm evading classifiers with
convex positive class under the weighted L_1 cost, showing that our best algorithm is at least
within a $\sqrt{\log(1/\epsilon)}$ factor of the optimal query complexity. We also consider variants
of the search procedure for when the attacker does not have a lower bound on the MAC or has
no negative instance from which to begin the search.

When the negative class is convex, we apply the randomized Ellipsoid-based method of
Bertsimas and Vempala (2004) to achieve efficient ε-IMAC search, with a query complexity
that is polynomial in D and log(1/ε), while minimizing L_p cost for p ≥ 1. With prior
knowledge that a learner produces classifiers with a specific class being convex, an adversary
can select the appropriate query algorithm to evade detection. If the adversary is unaware
of which of the positive and negative sets is convex, but has prior knowledge that one is
convex, they can run both searches concurrently to discover an ε-IMAC with a combined
polynomial query complexity.

^The sublevel set of any convex function is a convex set (Boyd and Vandenberghe, 2004) so such a
separating hyperplane always exists but may not be easily computed.

Solutions to the reverse engineering problem are clearly sufficient for the evasion problem.
Lowd and Meek (2005b) used a reverse engineering approach to evading linear classification
without significant cost to query complexity. An important consequence of the polynomial
evadability of convex-inducing classifiers is that for this class of classifiers, reverse
engineering is not necessary for evasion and in fact can be significantly harder.

Exploring near-optimal evasion is important for understanding how an adversary may
circumvent learners in security-sensitive settings. As described here, our algorithms may not
always directly apply in practice since various real-world obstacles persist. Queries may be
only partially observable or noisy and the feature set may be only partially known. Moreover,
an adversary may not be able to query all x ∈ X. Queries are almost always objects (such
as email messages) that are mapped into X by the adaptive system. A real-world adversary
must invert the feature-mapping, a generally difficult task. These limitations necessitate
further research on the impact of partial observability and approximate querying on ε-IMAC
search, and on the design of more secure filters.


Chapter  4 

Privacy-Preserving  Learning 


You  have  zero  privacy  anyway.  Get  over  it. 
-  Scott  McNealy,  Co-Founder,  Sun  Microsystems 


Privacy-preserving  learning  is  a  relatively  young  field  in  the  intersection  of  Security, 
Database,  TCS,  Statistics  and  Machine  Learning  research.  The  broad  goal  of  research  into 
privacy-preserving  learning  is  to  release  aggregate  statistics  on  a  dataset  without  disclosing 
local  information  about  individual  elements  of  the  data. 

Several recent studies in privacy-preserving learning have considered the trade-off
between utility or risk and the level of differential privacy guaranteed by mechanisms for
statistical query processing. In this chapter we study this trade-off in private Support Vector
Machine (SVM) learning. We present two efficient mechanisms, one for the case of
finite-dimensional feature mappings and one for potentially infinite-dimensional feature mappings
with translation-invariant kernels. For the case of translation-invariant kernels, the
proposed mechanism minimizes regularized empirical risk in a random Reproducing Kernel
Hilbert Space whose kernel uniformly approximates the desired kernel with high probability.
This technique, borrowed from large-scale learning, allows the mechanism to respond
with a finite encoding of the classifier, even when the function class is of infinite
Vapnik-Chervonenkis (VC) dimension. Differential privacy is established using a proof technique
from algorithmic stability. Utility (the mechanism's response function is pointwise ε-close
to non-private SVM with probability 1 − δ) is proven by appealing to the smoothness of
regularized empirical risk minimization with respect to small perturbations to the feature
mapping. We conclude with a lower bound on the optimal differential privacy of the SVM.
This negative result states that for any δ, no mechanism can be simultaneously (ε, δ)-useful
and β-differentially private for small ε and small β.


In the language of the taxonomy of Barreno et al. (2006), the privacy-preserving
mechanisms developed in this chapter are robust to both Exploratory and Causative attacks, which
aim to violate Confidentiality. An attacker with access to the released classifier can probe
it in an attempt to reveal information about the training data; moreover an attacker with
influence over (up to) all but one example of the training data may attempt to manipulate
the mechanism into revealing information about the unknown training example. In both
cases our strong guarantees of differential privacy prove that such attacks will fail. Finally,
our analysis considers adversaries with near-complete knowledge of the training data (n − 1
out of n examples), complete knowledge about the mechanism up to sources of randomness,
and access to the released classifier trained on the data. Similarly, the attacker may have
complete control over the known subset of training data. In sum, we allow for substantial
levels of adversarial information and control.


4.1  Introduction 


The goal of a well-designed statistical database is to provide aggregate information about
a database's entries while maintaining individual entries' privacy. These two goals of utility
and privacy are inherently discordant. For a mechanism to be useful, its responses must
closely resemble some target statistic of the database's entries. However to protect privacy,
it is often necessary for the mechanism's response distribution to be 'smoothed out', i.e.,
the mechanism must be randomized to reduce the individual entries' influence on this
distribution. It has been of key interest to the statistical database community to understand
when the goals of utility and privacy can be efficiently achieved simultaneously (Barak et al.,
2007; Blum et al., 2008; Chaudhuri and Monteleoni, 2009; Dinur and Nissim, 2003; Dwork
et al., 2007; Kasiviswanathan et al., 2008). In this chapter we consider the practical goal
of private regularized empirical risk minimization (ERM) in Reproducing Kernel Hilbert
Spaces for the special case of the Support Vector Machine (SVM). We adopt the strong
notion of differential privacy as formalized by Dwork (2006). Our efficient new mechanisms
are shown to parametrize functions that are close to non-private SVM under the L_∞-norm,
with high probability. In our setting this notion of utility is stronger than closeness of risk
(cf. Remark 28).

We employ a number of algorithmic and proof techniques new to differential privacy. One
of our new mechanisms borrows a technique from large-scale learning, in which regularized
ERM is performed in a random feature space whose inner-product uniformly approximates
the target feature space inner-product. This random feature space is constructed by viewing
the target kernel as a probability measure in the Fourier domain. This technique enables
the finite parametrization of responses from function classes with infinite VC dimension.
To establish utility, we show that regularized ERM is relatively insensitive to perturbations
of the kernel; not only does the technique of learning in a random RKHS enable
finitely-encoded privacy-preserving responses, but these responses well-approximate the responses
of non-private SVM. Together these two techniques may prove useful in extending
privacy-preserving mechanisms to learn in large function spaces. To prove differential privacy, we
borrow a proof technique from the area of algorithmic stability. We believe that stability
may become a fruitful avenue for constructing new private mechanisms in the future, based
on learning maps presently known to be stable.
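The random feature space idea can be illustrated with the well-known random Fourier features construction for the Gaussian kernel. The following is a minimal sketch under stated assumptions, not the mechanism developed in this chapter: the kernel is taken to be exp(−‖x − y‖²/(2σ²)), whose Fourier transform is a Gaussian measure with covariance I/σ², and all function names are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, sigma):
    """Target translation-invariant kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2 * sigma ** 2))


def sample_frequencies(dim, num_features, sigma, rng):
    """Draw frequencies from the kernel's Fourier transform, N(0, I / sigma^2)."""
    return [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
            for _ in range(num_features)]


def random_features(x, freqs):
    """Map x to a finite vector whose inner products approximate the kernel."""
    scale = 1.0 / math.sqrt(len(freqs))
    feats = []
    for w in freqs:
        proj = sum(wi * xi for wi, xi in zip(w, x))
        feats.extend((scale * math.cos(proj), scale * math.sin(proj)))
    return feats


def approx_kernel(x, y, freqs):
    """Inner product in the random feature space."""
    fx, fy = random_features(x, freqs), random_features(y, freqs)
    return sum(a * b for a, b in zip(fx, fy))


if __name__ == "__main__":
    rng = random.Random(0)
    freqs = sample_frequencies(dim=3, num_features=4000, sigma=1.0, rng=rng)
    x, y = [0.1, 0.2, 0.3], [0.5, -0.4, 0.0]
    print(gaussian_kernel(x, y, 1.0), approx_kernel(x, y, freqs))
```

A classifier learned on these features is described by a finite weight vector, even though the Gaussian kernel's own feature space is infinite-dimensional; this is what makes a finitely-encoded response possible.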

Of particular interest is the optimal differential privacy of the SVM, which loosely
speaking is the best level of privacy achievable by any accurate mechanism for SVM
learning. Through our privacy-preserving mechanisms for the SVM, endowed with guarantees
of utility, we upper bound optimal differential privacy. We also provide lower bounds on
the SVM's optimal differential privacy, which are impossibility results for simultaneously
achieving high levels of utility and privacy.


An earlier version of this chapter appeared as a technical report (Rubinstein et al.,
2009). Subsequently, but independently, Sarwate et al. (2009) recently considered
privacy-preserving mechanisms for SVM learning. Their mechanism for linear SVM guarantees
differential privacy by adding a random term to the objective as they pioneered for
regularized logistic regression (Chaudhuri and Monteleoni, 2009). For non-linear SVM the authors
exploit the same technique from large-scale learning we use here. The authors also develop
a method for tuning the regularization parameter while preserving privacy, using a
comparison procedure due to McSherry and Talwar (2007). It is noteworthy that preserving
privacy via the randomized objective applies only to differentiable loss functions, ruling out
the important case of hinge-loss. Our mechanisms preserve privacy for any convex loss. And
while Sarwate et al. (2009) prove risk bounds for their mechanisms, our utility guarantees
are strictly stronger for hinge-loss (cf. Remark 28) and we provide lower bounds on
simultaneously achievable utility and privacy. Finally our proof of differential privacy is
interesting due to its novel use of stability.


Chapter Organization. The remainder of this chapter is organized as follows. After
concluding this section with a summary of related work, we recall basic concepts of
differential privacy and SVM learning in Section 4.2. Sections 4.3 and 4.4 describe the new
mechanisms for private SVM learning for finite-dimensional feature maps and (potentially
infinite-dimensional) feature maps with translation-invariant kernels. Each mechanism is
accompanied with proofs of privacy and utility bounds. Section 4.5 considers the special
case of hinge-loss and presents an upper bound on the SVM's optimal differential privacy.
A corresponding lower bound is then given in Section 4.6. We conclude the chapter with a
summary of our contributions.


4.1.1  Related  Work 

There  is  a  rich  literature  of  prior  work  on  differential  privacy  in  the  theory  community. 
The  following  sections  summarize  work  related  to  our  own,  organized  to  contrast  this  work 
with  our  main  contributions. 


Range Spaces Parametrizing Vector-Valued Statistics or Simple Functions. Early
work on private interactive mechanisms focused on approximating real- and vector-valued
statistics (e.g., Barak et al., 2007; Blum et al., 2005; Dinur and Nissim, 2003; Dwork, 2006;
Dwork et al., 2006). McSherry and Talwar (2007) first considered private mechanisms with
range spaces parametrizing sets more general than real-valued vectors, and used such
differentially private mappings for mechanism design. More related to our work are the
private mechanisms for regularized logistic regression proposed and analyzed by Chaudhuri
and Monteleoni (2009). There the mechanism's range space parametrizes the VC-dimension
d+1 class of linear hyperplanes in ℝ^d. Kasiviswanathan et al. (2008) showed that discretized
concept classes can be PAC learned or agnostically learned privately, albeit via an
inefficient mechanism. Blum et al. (2008) showed that non-interactive mechanisms can privately
release anonymized data such that utility is guaranteed over classes of predicate queries
with polynomial VC dimension, when the domain is discretized. Dwork et al. (2009) more
recently characterized when utility and privacy can be achieved by efficient non-interactive
mechanisms. In this paper we consider efficient mechanisms for private SVM learning,
whose range spaces parametrize real-valued functions (whose sign form trained classifiers).
One case covered by our analysis is learning with a Gaussian kernel, which corresponds to
learning over a rich class of infinite dimension.

Practical Privacy-Preserving Learning (Mostly) via Subset-Sums. Most prior
work in differential privacy has focused on the deep analysis of mechanisms for relatively
simple statistics (with histograms and contingency tables as explored by Blum et al. 2005
and Barak et al. 2007 respectively, as examples) and learning algorithms (e.g., interval
queries and half-spaces as explored by Blum et al. 2008), or on constructing learning
algorithms that can be decomposed into subset-sum operations (e.g., perceptron, k-NN, ID3
as described by Blum et al. 2005, and various recommender systems due to the work of
McSherry and Mironov 2009). By contrast, we consider the practical goal of SVM learning,
which does not decompose into subset-sums. It is also notable that our mechanisms run in
polynomial time. The most related work to our own in this regard is due to Chaudhuri and
Monteleoni (2009), although their results hold only for differentiable loss, and finite feature
mappings.


The Privacy-Utility Trade-Off. Like several prior studies, we consider the trade-off
between privacy and utility. Barak et al. (2007) presented a mechanism for releasing
contingency tables that guarantees differential privacy and also guarantees a notion of accuracy:
with high probability all marginals from the released table are close in L_1-norm to the
true table's marginals. As mentioned above, Blum et al. (2008) developed a private
non-interactive mechanism that releases anonymized data such that all predicate queries in a
VC-class take on similar values on the anonymized data and original data. In the work of
Kasiviswanathan et al. (2008), utility corresponds to PAC learning: with high probability
the response and target concepts are close, averaged over the underlying measure.

A sequence of prior negative results has shown that any mechanism providing overly
accurate responses cannot be private (Dinur and Nissim, 2003; Dwork and Yekhanin, 2008;
Dwork et al., 2007). Dinur and Nissim (2003) showed that if noise of rate only o(√n) is
added to subset sum queries on a database of bits then an adversary can reconstruct a
1 − o(1) fraction of the database. This is a threshold phenomenon that says if accuracy
is too high, privacy cannot be guaranteed at all. This result was more recently extended
to allow for mechanisms that answer a small fraction of queries arbitrarily (Dwork et al.,
2007). We show a similar negative result for the private SVM setting: any mechanism that
is too accurate with respect to the SVM cannot guarantee strong levels of privacy.
Connections between Stability, Robust Statistics, and Global Sensitivity. To
prove differential privacy, we borrow a proof technique from the area of algorithmic stability.
In passing, Kasiviswanathan et al. (2008) note the similarity between notions of algorithmic
stability and differential privacy, but do not exploit it. The connection between
algorithmic stability and differential privacy is qualitatively similar to the recent work of
Dwork and Lei (2009), who demonstrated that robust estimators can serve as the basis for
private mechanisms, by exploiting the limited influence of outliers on such estimators.


4.2  Background  and  Definitions 


A database D is a sequence of n > 1 entries or rows (x_i, y_i) ∈ ℝ^d × {−1, 1}, which are
input point-label pairs or examples. We say that a pair of databases D_1, D_2 are neighbors if
they differ on one entry. A mechanism M is a service trusted with access to a database D,
that releases aggregate information about D while maintaining privacy of individual entries.
By M(D) we mean the response of M on D. We assume that this is the only information
released by the mechanism. Denote the range space of M by T_M. We adopt the following
strong notion of differential privacy due to Dwork (2006).


Definition 26. For any β > 0, a randomized mechanism M provides β-differential privacy,
if, for all neighboring databases D_1, D_2 and all responses t ∈ T_M the mechanism satisfies

\[
\log\left( \frac{\Pr\left(M(D_1) = t\right)}{\Pr\left(M(D_2) = t\right)} \right) \leq \beta .
\]


The probability in the definition is over the randomization in M. For continuous T_M we
mean by this ratio a Radon-Nikodym derivative of the distribution of M(D_1) with respect
to the distribution of M(D_2). If an adversary knows M and the first n − 1 entries of D,
she may simulate the mechanism with different choices for the missing example. If the
mechanism's response distribution varies smoothly with her choice, the adversary will not
be able to infer the true value of entry n by querying M. In the sequel we assume WLOG
that each pair of neighboring databases differ on their last entry.
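As a concrete instance of the definition, consider the textbook Laplace mechanism applied to a scalar statistic. This is a minimal sketch for illustration only and is not one of the SVM mechanisms of this chapter: the statistic (a mean of values in [0, 1]), its sensitivity 1/n, and all function names are assumptions. With noise scale sensitivity/β, the log-ratio of response densities on neighboring databases never exceeds β.

```python
import math


def laplace_log_density(t, mean, scale):
    """Log density at t of a Laplace distribution centered at mean."""
    return -abs(t - mean) / scale - math.log(2 * scale)


def mean_statistic(database):
    """Target statistic: the mean of entries assumed to lie in [0, 1]."""
    return sum(database) / len(database)


def log_density_ratio(t, db1, db2, beta):
    """Log-ratio of response densities at t for the Laplace mechanism."""
    sensitivity = 1.0 / len(db1)  # changing one entry moves the mean by <= 1/n
    scale = sensitivity / beta
    return (laplace_log_density(t, mean_statistic(db1), scale)
            - laplace_log_density(t, mean_statistic(db2), scale))


if __name__ == "__main__":
    d1 = [0.2, 0.9, 0.4, 1.0]
    d2 = [0.2, 0.9, 0.4, 0.0]  # neighbor: differs only on the last entry
    beta = 0.5
    worst = max(abs(log_density_ratio(i / 100.0, d1, d2, beta))
                for i in range(-200, 300))
    print(worst, "<=", beta)
```

The bound follows from the triangle inequality: the two densities are centered at most one sensitivity apart, so their log-ratio is at most sensitivity/scale = β at every response t.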

Intuitively the more an 'interesting' mechanism M is perturbed to guarantee differential
privacy, the less like M the resulting mechanism M̂ will become. The next definition
formalizes the notion of 'likeness'.


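Definition 27. Consider two mechanisms $\hat{M}$ and $M$ with the same domain and with response spaces $\mathcal{T}_{\hat{M}}$ and $\mathcal{T}_M$. Let $\mathcal{X}$ be some set and let $\mathcal{F}$ be a space of real-valued functions on $\mathcal{X}$ that is parametrized by the response spaces: for every $t \in \mathcal{T}_{\hat{M}} \cup \mathcal{T}_M$ let $f_t \in \mathcal{F}$ be some function. Finally assume $\mathcal{F}$ is endowed with norm $\|\cdot\|_{\mathcal{F}}$. Then for $\epsilon > 0$ and $0 < \delta < 1$ we say that^1 $\hat{M}$ is $(\epsilon, \delta)$-useful with respect to $M$ if, for all databases $D$,

$$\Pr\left(\left\| f_{\hat{M}(D)} - f_{M(D)} \right\|_{\mathcal{F}} \le \epsilon\right) \ge 1 - \delta .$$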


^1 We are overloading the term '$(\epsilon, \delta)$-usefulness' introduced by Blum et al. (2008) for non-interactive mechanisms. Our definition is analogous for the present setting of privacy-preserving learning, where a single function is released.
Algorithm 10 SVM

Inputs: database $D = \{(x_i, y_i)\}_{i=1}^n$ with $x_i \in \mathbb{R}^d$, $y_i \in \{-1, 1\}$; kernel $k : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$; convex loss function $\ell$; parameter $C > 0$.

1. $\alpha^\star \leftarrow$ Solve the SVM's dual QP; and

2. Return vector $\alpha^\star$.


Typically $\hat{M}$ will be a privacy-preserving version of $M$, that has been perturbed somehow. In the sequel we will take $\|\cdot\|_{\mathcal{F}}$ to be the sup-norm over a subset $\mathcal{M} \subseteq \mathbb{R}^d$ containing the data, which we denote by $\|f\|_{\infty;\mathcal{M}} = \sup_{x \in \mathcal{M}} |f(x)|$. It will also be convenient to use the notation $\|k\|_{\infty;\mathcal{M}} = \sup_{x, y \in \mathcal{M}} |k(x, y)|$ for bivariate functions $k(\cdot, \cdot)$.


Remark 28. In the sequel we develop privacy-preserving mechanisms that are useful with respect to the Support Vector Machine (see the next section for a brief introduction to the SVM). The SVM works to minimize the expected hinge-loss (i.e., risk in terms of the hinge-loss), which is a convex surrogate for the expected 0-1 loss. Since the hinge-loss is Lipschitz in the real-valued function output by the SVM, it follows that a mechanism $\hat{M}$ having utility with respect to the SVM also has expected hinge-loss that is within $\epsilon$ of the SVM's hinge-loss with high probability. That is, $(\epsilon, \delta)$-usefulness with respect to the sup-norm is stronger than guaranteed closeness of risk. We consider the hinge-loss further in Sections 4.5 and 4.6. Until then we work with arbitrary convex, Lipschitz losses.


We will see that the presented analysis does not simultaneously guarantee privacy at arbitrary levels and utility at arbitrary accuracy. The highest level of privacy guaranteed over all $(\epsilon, \delta)$-useful mechanisms with respect to a target mechanism $M$, is quantified by the optimal differential privacy for $M$. We define this notion for the SVM here, but the concept extends to any target mechanism of interest. We present upper and lower bounds on $\beta(\epsilon, \delta, C, n, \ell, k)$ for the SVM in Sections 4.5 and 4.6 respectively.


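Definition 29. For $\epsilon, C > 0$, $\delta \in (0, 1)$, $n > 1$, loss function $\ell(y, \hat{y})$ convex in $\hat{y}$, and kernel $k$, the optimal differential privacy for the SVM is the function

$$\beta(\epsilon, \delta, C, n, \ell, k) = \inf_{\hat{M} \in \mathcal{I}} \; \sup_{(D_1, D_2) \in \mathcal{D}} \; \sup_{t \in \mathcal{T}_{\hat{M}}} \log\left(\frac{\Pr(\hat{M}(D_1) = t)}{\Pr(\hat{M}(D_2) = t)}\right) ,$$

where $\mathcal{I}$ is the set of all $(\epsilon, \delta)$-useful mechanisms with respect to the SVM with parameter $C$, loss $\ell$, and kernel $k$; and $\mathcal{D}$ is the set of all pairs of neighboring databases with $n$ entries.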


4.2.1  Support  Vector  Machines 

Soft-margin SVM has convex primal program

$$\min_{w \in \mathbb{R}^F} \; \frac{1}{2}\|w\|_2^2 + \frac{C}{n}\sum_{i=1}^n \ell\left(y_i, f_w(x_i)\right) ,$$

where the $x_i \in \mathbb{R}^d$ are training input points and the $y_i \in \{-1, 1\}$ are their training labels, $n$ is the sample size, $\phi : \mathbb{R}^d \to \mathbb{R}^F$ is a feature mapping taking points in input space to some (possibly infinite) $F$-dimensional feature space, $\ell(y, \hat{y})$ is a loss function convex in $\hat{y}$, and $w$ is a hyperplane normal vector in feature space (Bishop, 2006; Burges, 1998; Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2001).

| Kernel | $g(\Delta)$ | $p(\omega)$ |
|---|---|---|
| RBF | $\exp\left(-\frac{\|\Delta\|_2^2}{2\sigma^2}\right)$ | [illegible in source] |
| Laplacian | $\exp(-\|\Delta\|_1)$ | $\prod_{i=1}^d \frac{1}{\pi(1 + \omega_i^2)}$ |
| Cauchy | $\prod_{i=1}^d \frac{2}{1 + \Delta_i^2}$ | $\exp(-\|\omega\|_1)$ |

Table 4.1: Example translation-invariant kernels, their $g$ functions and the corresponding Fourier transforms.

For finite $F$, predictions are taken as the sign of $f^\star(x) = f_{w^\star}(x) = \langle \phi(x), w^\star \rangle$. We will refer to both $f_w(\cdot)$ and $\mathrm{sgn}(f_w(\cdot))$ as classifiers, with the exact meaning apparent from the context. When $F$ is large the solution may be more easily obtained via the dual. For example, see Program (4.9) in Section 4.5 for the dual formulation of the hinge-loss $\ell(y, \hat{y}) = (1 - y\hat{y})_+$, which is the loss most commonly associated with soft-margin SVM. The vector of maximizing dual variables $\alpha^\star$ returned by dualized SVM parametrizes the function $f^\star = f_{\alpha^\star}$ as $f_\alpha(\cdot) = \sum_{i=1}^n \alpha_i y_i k(\cdot, x_i)$, where $k(x, y) = \langle \phi(x), \phi(y) \rangle$ is the kernel function.

Translation-invariant kernels are an important class of kernel given by functions with the form $k(x, y) = g(x - y)$ (see Table 4.1 for examples). We define the mechanism SVM to be the dual optimization that responds with the vector $\alpha^\star$, as described by Algorithm 10.
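To make the dual parametrization concrete, here is a minimal sketch of evaluating a classifier of the form $f_\alpha(\cdot) = \sum_i \alpha_i y_i k(\cdot, x_i)$ with a translation-invariant kernel. The training points, labels, and dual variables are hypothetical placeholders (they are not the output of an actual QP solve), and the unit-bandwidth RBF kernel is just one assumed choice.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Translation-invariant RBF kernel k(x, y) = g(x - y)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dual_classifier(alpha, xs, ys, kernel):
    """Return the function f_alpha(.) = sum_i alpha_i * y_i * k(., x_i)."""
    def f(x):
        return sum(a * y * kernel(x, xi) for a, xi, y in zip(alpha, xs, ys))
    return f


# Hypothetical two-point training set and dual variables (illustrative only).
xs = [(0.0, 0.0), (1.0, 1.0)]
ys = [-1, 1]
alpha = [0.5, 0.5]

f = dual_classifier(alpha, xs, ys, rbf_kernel)
label_near_positive = 1 if f((0.9, 1.1)) >= 0 else -1
label_near_negative = 1 if f((0.1, -0.1)) >= 0 else -1
print(label_near_positive, label_near_negative)
```

Note that evaluating `f` requires the training points themselves, which is the reason releasing the dual solution directly is problematic for privacy.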


4.3  Mechanism  for  Finite  Feature  Maps 


As a first step towards private SVM learning we begin by considering the simple case of finite $F$-dimensional feature maps. Algorithm 11 describes the PrivateSVM-Finite mechanism, which follows the established pattern of preserving differential privacy: after forming the primal solution to the SVM (an $F$-dimensional vector) the mechanism adds Laplace-distributed noise to the weight vector. Guaranteeing differential privacy proceeds via the established two-step process of calculating the $L_1$-sensitivity of the SVM's weight vector, then showing that $\beta$-differential privacy follows from sensitivity together with the choice of Laplace noise with scale equal to sensitivity divided by $\beta$.

To calculate sensitivity, we exploit the algorithmic stability of regularized ERM. Intuitively, stability corresponds to continuity of a learning map. Several notions of stability are known to lead to good generalization error bounds (Bousquet and Elisseeff, 2002; Devroye and Wagner, 1979; Kearns and Ron, 1999; Kutin and Niyogi, 2002), sometimes in cases where class capacity-based approaches such as VC theory do not apply. A learning map $\mathcal{A}$ is a function that maps a database $D$ to a classifier $f_D$; it is precisely the composition of a mechanism followed by the classifier parametrization mapping. A learning map $\mathcal{A}$ is said to have $\gamma$-uniform stability with respect to loss $\ell(\cdot, \cdot)$ if for all neighboring databases $D, D'$, the losses of the classifiers trained on $D$ and $D'$ are close on all test examples: $\|\ell(\cdot, \mathcal{A}(D)) - \ell(\cdot, \mathcal{A}(D'))\|_\infty \le \gamma$ (Bousquet and Elisseeff, 2002). Our first lemma computes sensitivity of the SVM's weight vector for general convex, Lipschitz loss by following the proof of Schölkopf and Smola (2001, Theorem 12.4) which establishes that SVM learning has uniform stability (a result due to Bousquet and Elisseeff 2002).

Algorithm 11 PrivateSVM-Finite

Inputs: database $D = \{(x_i, y_i)\}_{i=1}^n$ with $x_i \in \mathbb{R}^d$, $y_i \in \{-1, 1\}$; finite feature map $\phi : \mathbb{R}^d \to \mathbb{R}^F$ and induced kernel $k$; convex loss function $\ell$; and parameters $\lambda, C > 0$.

1. $\alpha^\star \leftarrow$ Run Algorithm 10 on $D$ with parameter $C$, kernel $k$, and loss $\ell$;

2. $\tilde{w} \leftarrow \sum_{i=1}^n \alpha_i^\star y_i \phi(x_i)$;

3. $\mu \leftarrow$ Draw i.i.d. sample of $F$ scalars from Laplace$(0, \lambda)$; and

4. Return $\hat{w} = \tilde{w} + \mu$


Lemma 30. Consider loss function $\ell(y, \hat{y})$ that is convex and $L$-Lipschitz in $\hat{y}$, and RKHS $\mathcal{H}$ induced by finite $F$-dimensional feature mapping $\phi$ with bounded kernel $k(x, x) \le \kappa^2$ for all $x \in \mathbb{R}^d$. Let $w_S \in \mathbb{R}^F$ be the minimizer of the following regularized empirical risk function for each database $S = \{(x_i, y_i)\}_{i=1}^n$

$$R_{reg}(w, S) = \frac{C}{n}\sum_{i=1}^n \ell\left(y_i, f_w(x_i)\right) + \frac{1}{2}\|w\|_2^2 .$$

Then for every pair of neighboring databases $D, D'$ of $n$ entries, $\|w_D - w_{D'}\|_1 \le 4LC\kappa\sqrt{F}/n$.


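Proof. We now calculate the sensitivity of the SVM primal weight vector for general convex, Lipschitz loss functions. For convenience we define $R_{emp}(w, S) = n^{-1}\sum_{i=1}^n \ell(y_i, f_w(x_i))$ for any training set $S$; then the first-order necessary KKT conditions imply

$$0 \in \partial_w R_{reg}(w_D, D) = C\,\partial_w R_{emp}(w_D, D) + w_D , \quad (4.1)$$

$$0 \in \partial_w R_{reg}(w_{D'}, D') = C\,\partial_w R_{emp}(w_{D'}, D') + w_{D'} , \quad (4.2)$$

where $\partial_w$ is the subdifferential operator with respect to $w$. Define the auxiliary risk function

$$\tilde{R}(w) = C\left\langle \partial_w R_{emp}(w_D, D) - \partial_w R_{emp}(w_{D'}, D'),\; w - w_{D'} \right\rangle + \frac{1}{2}\|w - w_{D'}\|_2^2 .$$

It is easy to see that $\tilde{R}(w)$ is strictly convex in $w$ and that $\tilde{R}(w_{D'}) = \{0\}$. And by Equation (4.2)

$$C\,\partial_w R_{emp}(w_D, D) + w \in C\,\partial_w R_{emp}(w_D, D) - C\,\partial_w R_{emp}(w_{D'}, D') + w - w_{D'} = \partial_w \tilde{R}(w) ,$$

which combined with Equation (4.1) implies $0 \in \partial_w \tilde{R}(w_D)$, so that $\tilde{R}(w)$ is minimized at $w_D$. Thus there exists some non-positive $r \in \tilde{R}(w_D)$. Next simplify the first term of $\tilde{R}(w_D)$, scaled by $n/C$ for notational convenience:

$$n\left\langle \partial_w R_{emp}(w_D, D) - \partial_w R_{emp}(w_{D'}, D'),\; w_D - w_{D'} \right\rangle$$
$$= \sum_{i=1}^n \left\langle \partial_w \ell\left(y_i, f_{w_D}(x_i)\right) - \partial_w \ell\left(y_i', f_{w_{D'}}(x_i')\right),\; w_D - w_{D'} \right\rangle$$
$$= \sum_{i=1}^{n-1} \left(\ell'\left(y_i, f_{w_D}(x_i)\right) - \ell'\left(y_i, f_{w_{D'}}(x_i)\right)\right)\left(f_{w_D}(x_i) - f_{w_{D'}}(x_i)\right)$$
$$\quad + \ell'\left(y_n, f_{w_D}(x_n)\right)\left(f_{w_D}(x_n) - f_{w_{D'}}(x_n)\right) - \ell'\left(y_n', f_{w_{D'}}(x_n')\right)\left(f_{w_D}(x_n') - f_{w_{D'}}(x_n')\right)$$
$$\ge \ell'\left(y_n, f_{w_D}(x_n)\right)\left(f_{w_D}(x_n) - f_{w_{D'}}(x_n)\right) - \ell'\left(y_n', f_{w_{D'}}(x_n')\right)\left(f_{w_D}(x_n') - f_{w_{D'}}(x_n')\right) ,$$

where the second equality follows from $\partial_w \ell(y, f_w(x)) = \ell'(y, f_w(x))\,\phi(x)$, where $\ell'(y, \hat{y}) = \partial_{\hat{y}} \ell(y, \hat{y})$, and $x_i' = x_i$ and $y_i' = y_i$ for each $i \in [n - 1]$. The inequality follows from the convexity of $\ell$ in its second argument.^2 Combined with the existence of non-positive $r \in \tilde{R}(w_D)$ this yields that there exists

$$g \in \ell'\left(y_n', f_{w_{D'}}(x_n')\right)\left(f_{w_D}(x_n') - f_{w_{D'}}(x_n')\right) - \ell'\left(y_n, f_{w_D}(x_n)\right)\left(f_{w_D}(x_n) - f_{w_{D'}}(x_n)\right)$$

such that

$$0 \ge \frac{n}{C}\, r \ge -g + \frac{n}{2C}\|w_D - w_{D'}\|_2^2 .$$

And since $|g| \le 2L\|f_{w_D} - f_{w_{D'}}\|_\infty$ by the Lipschitz continuity of $\ell$, this in turn implies

$$\frac{n}{2C}\|w_D - w_{D'}\|_2^2 \le 2L\left\|f_{w_D} - f_{w_{D'}}\right\|_\infty . \quad (4.3)$$

Now by the reproducing property and Cauchy-Schwarz inequality we can upper bound the classifier difference's infinity norm by the Euclidean norm on the weight vectors: for each $x$

$$\left|f_{w_D}(x) - f_{w_{D'}}(x)\right| = \left|\langle \phi(x), w_D - w_{D'} \rangle\right| \le \|\phi(x)\|_2 \|w_D - w_{D'}\|_2 = \sqrt{k(x, x)}\,\|w_D - w_{D'}\|_2 \le \kappa\,\|w_D - w_{D'}\|_2 .$$

Combining this with Inequality (4.3) yields $\|w_D - w_{D'}\|_2 \le 4LC\kappa/n$ as claimed. The $L_1$-based sensitivity then follows from $\|w\|_1 \le \sqrt{F}\|w\|_2$ for all $w \in \mathbb{R}^F$. □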

With  the  weight  vector’s  sensitivity  in  hand,  differential  privacy  follows  immediately 
from  the  proof  technique  established  by  Dwork  et  al.  (2006). 


^2 Namely for convex $f$ and any $a, b \in \mathbb{R}$, $(g_a - g_b)(a - b) \ge 0$ for all $g_a \in \partial f(a)$ and all $g_b \in \partial f(b)$.
Theorem 31 (Privacy of PrivateSVM-Finite). For any $\beta > 0$, database $D$ of size $n$, $C > 0$, loss function $\ell(y, \hat{y})$ that is convex and $L$-Lipschitz in $\hat{y}$, and finite $F$-dimensional feature map with kernel $k(x, x) \le \kappa^2$ for all $x \in \mathbb{R}^d$, PrivateSVM-Finite run on $D$ with loss $\ell$, kernel $k$, noise parameter $\lambda \ge 4LC\kappa\sqrt{F}/(\beta n)$ and regularization parameter $C$ guarantees $\beta$-differential privacy.

This first main result establishes differential privacy of the new PrivateSVM-Finite algorithm. The more "private" the data, the more noise must be added. The more entries in the database, the less noise is needed to achieve the same level of privacy. Since the noise vector $\mu$ has exponential tails, standard tail bound inequalities quickly lead to $(\epsilon, \delta)$-usefulness for PrivateSVM-Finite.
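The noise-addition step can be sketched as follows, assuming the privacy condition takes the form $\lambda \ge 4LC\kappa\sqrt{F}/(\beta n)$ from Theorem 31. The function names and the example weight vector are hypothetical; in the mechanism itself the weights would be the primal SVM solution rather than a hand-picked list.

```python
import math
import random


def private_svm_finite_scale(L, C, kappa, F, beta, n):
    """Smallest Laplace scale meeting lambda >= 4*L*C*kappa*sqrt(F)/(beta*n)."""
    return 4.0 * L * C * kappa * math.sqrt(F) / (beta * n)


def sample_laplace(scale, rng):
    """Draw a Laplace(0, scale) variate as a random sign times an exponential."""
    magnitude = rng.expovariate(1.0 / scale)   # exponential with mean `scale`
    return magnitude if rng.random() < 0.5 else -magnitude


def perturb(weights, scale, rng):
    """Add i.i.d. Laplace(0, scale) noise to every coordinate."""
    return [w + sample_laplace(scale, rng) for w in weights]


rng = random.Random(0)
w_tilde = [0.2, -0.1, 0.4]                      # hypothetical primal solution, F = 3
lam = private_svm_finite_scale(L=1.0, C=1.0, kappa=1.0, F=3, beta=1.0, n=1000)
lam_more_data = private_svm_finite_scale(L=1.0, C=1.0, kappa=1.0, F=3, beta=1.0, n=2000)
w_hat = perturb(w_tilde, lam, rng)
print(lam, lam_more_data, w_hat)
```

Doubling `n` halves the required scale, and halving `beta` doubles it, which matches the qualitative remarks above about privacy level and database size.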


Theorem 32 (Utility of PrivateSVM-Finite). Consider any $C > 0$, $n > 1$, database $D$ of $n$ entries, arbitrary convex loss $\ell$, and finite $F$-dimensional feature mapping $\phi$ with kernel $k$ and $|\phi(x)_i| \le \Phi$ for all $x \in \mathcal{M}$ and $i \in [F]$ for some $\Phi > 0$ and $\mathcal{M} \subseteq \mathbb{R}^d$. For any $\epsilon > 0$ and $\delta \in (0, 1)$, PrivateSVM-Finite run on $D$ with loss $\ell$, kernel $k$, noise parameter

$$0 < \lambda \le \frac{\epsilon}{2\Phi\left(F \log_e 2 + \log_e \frac{1}{\delta}\right)} ,$$

and regularization parameter $C$, is $(\epsilon, \delta)$-useful with respect to the SVM under the $\|\cdot\|_{\infty;\mathcal{M}}$-norm.


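Proof. Our goal is to compare the SVM and PrivateSVM-Finite classifications of any point $x \in \mathcal{M}$:

$$\left|f_{\hat{M}(D)}(x) - f_{M(D)}(x)\right| = \left|\langle \hat{w}, \phi(x) \rangle - \langle \tilde{w}, \phi(x) \rangle\right| = \left|\langle \mu, \phi(x) \rangle\right| \le \|\mu\|_1 \|\phi(x)\|_\infty \le \Phi \|\mu\|_1 .$$

The absolute value of a zero mean Laplace random variable with scale parameter $\lambda$ is exponentially distributed with rate $\lambda^{-1}$ (mean $\lambda$). Moreover the sum of $q$ i.i.d. exponential random variables has Erlang $q$-distribution with the same parameter.^3 Thus we have, for Erlang $F$-distributed random variable $X$ and any $t > 0$,

$$\forall x \in \mathcal{M}, \quad \left|f_{\hat{M}(D)}(x) - f_{M(D)}(x)\right| \le \Phi X$$

$$\Rightarrow \quad \forall \epsilon > 0, \quad \Pr\left(\left\|f_{\hat{M}(D)} - f_{M(D)}\right\|_{\infty;\mathcal{M}} > \epsilon\right) \le \Pr\left(X > \epsilon/\Phi\right) \le \frac{E\left[e^{tX}\right]}{e^{\epsilon t/\Phi}} . \quad (4.4)$$

Here we have employed the standard Chernoff tail bound technique using Markov's inequality. The numerator of (4.4), the moment generating function of the Erlang $F$-distribution with parameter $\lambda$, is $(1 - \lambda t)^{-F}$ for all $t < \lambda^{-1}$. Together with the choice of $t = (2\lambda)^{-1}$ this gives

$$\Pr\left(\left\|f_{\hat{M}(D)} - f_{M(D)}\right\|_{\infty;\mathcal{M}} > \epsilon\right) \le (1 - \lambda t)^{-F} e^{-\epsilon t/\Phi} = 2^F e^{-\epsilon/(2\lambda\Phi)} = \exp\left(F \log_e 2 - \frac{\epsilon}{2\lambda\Phi}\right) .$$

And provided that $\lambda \le \epsilon / \left(2\Phi\left(F \log_e 2 + \log_e \frac{1}{\delta}\right)\right)$ this probability is bounded by $\delta$. □

^3 The Erlang $q$-distribution with scale $\lambda$ has density $\frac{x^{q-1} e^{-x/\lambda}}{\lambda^q (q-1)!}$, CDF $1 - e^{-x/\lambda}\sum_{j=0}^{q-1} \frac{(x/\lambda)^j}{j!}$, expectation $q\lambda$, variance $q\lambda^2$.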
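The tail bound in this proof can be checked by simulation. The sketch below assumes the noise condition $\lambda \le \epsilon / (2\Phi(F \log_e 2 + \log_e(1/\delta)))$ and estimates how often $\Phi\|\mu\|_1$ exceeds $\epsilon$; the particular values of `eps`, `delta`, `F`, and `Phi` are hypothetical, and the helper names are illustrative rather than part of the mechanism.

```python
import math
import random


def utility_noise_bound(eps, delta, F, Phi):
    """Largest scale allowed by eps / (2*Phi*(F*ln 2 + ln(1/delta)))."""
    return eps / (2.0 * Phi * (F * math.log(2.0) + math.log(1.0 / delta)))


def chernoff_bound(eps, lam, F, Phi):
    """Chernoff bound exp(F*ln 2 - eps/(2*lam*Phi)) on Pr(Phi*||mu||_1 > eps)."""
    return math.exp(F * math.log(2.0) - eps / (2.0 * lam * Phi))


def empirical_failure_rate(eps, lam, F, Phi, trials, rng):
    """Monte Carlo estimate of Pr(Phi * ||mu||_1 > eps) for Laplace(0, lam) noise."""
    failures = 0
    for _ in range(trials):
        # |Laplace(0, lam)| is exponential with mean lam, so ||mu||_1 is Erlang.
        l1_norm = sum(rng.expovariate(1.0 / lam) for _ in range(F))
        if Phi * l1_norm > eps:
            failures += 1
    return failures / trials


eps, delta, F, Phi = 0.1, 0.05, 5, 1.0          # hypothetical settings
lam = utility_noise_bound(eps, delta, F, Phi)
bound = chernoff_bound(eps, lam, F, Phi)
rate = empirical_failure_rate(eps, lam, F, Phi, 20000, random.Random(0))
print(lam, bound, rate)
```

At the largest admissible scale the Chernoff bound equals `delta` exactly, while the simulated failure rate is typically well below it, reflecting the slack in the bound.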

Our  second  main  result  establishes  that  PrivateSVM-Finite  is  not  only  differentially 
private,  but  that  it  releases  a  classifier  that  is  similar  to  the  SVM.  Utility  and  privacy  are 
competing  properties,  however,  since  utility  demands  that  the  noise  not  be  too  large. 


4.4  Mechanism  for  Translation-Invariant  Kernels 

Consider now the problem of privately learning in an RKHS $\mathcal{H}$ induced by an infinite dimensional feature mapping $\phi$. As a mechanism's response must be finitely encodable, the primal parametrization seems less appealing than for PrivateSVM-Finite. It is natural to look to the SVM's dual solution as a starting point: the Representer Theorem (Kimeldorf and Wahba, 1971) states that the optimizing $f^\star \in \mathcal{H}$ must be in the span of the data, a finite-dimensional subspace. While the coordinates in this subspace (the $\alpha^\star$ dual variables) could be perturbed in the usual way to guarantee differential privacy, the subspace's basis (the data) is also needed to parametrize $f^\star$. To side-step this apparent stumbling block, we take another approach by approximating $\mathcal{H}$ with a random RKHS $\hat{\mathcal{H}}$ induced by a random finite-dimensional map $\hat{\phi}$. This then allows us to respond with a finite primal parametrization. Algorithm 12 summarizes the PrivateSVM mechanism.

As noted recently by Rahimi and Recht (2008), the Fourier transform $p$ of the $g$ function of a continuous positive-definite translation-invariant kernel is a non-negative measure (Rudin, 1994). Rahimi and Recht (2008) exploit this fact to construct a random finite-dimensional RKHS $\hat{\mathcal{H}}$ by drawing $\hat{d}$ vectors from $p$. These vectors $\rho_1, \ldots, \rho_{\hat{d}}$ define the following random $2\hat{d}$-dimensional feature map

$$\hat{\phi}(\cdot) = \hat{d}^{-1/2}\left[\cos(\langle \rho_1, \cdot \rangle), \sin(\langle \rho_1, \cdot \rangle), \ldots, \cos(\langle \rho_{\hat{d}}, \cdot \rangle), \sin(\langle \rho_{\hat{d}}, \cdot \rangle)\right]^T . \quad (4.5)$$

Inner-products in the random feature space approximate $k(\cdot, \cdot)$ uniformly, and to arbitrary precision for sufficiently large parameter $\hat{d}$, as restated in Lemma 37. We denote the inner-product in the random feature space by $\hat{k}$. Rahimi and Recht (2008) applied this approximation to large-scale learning. For large-scale learning, good approximations can be found for $\hat{d} \ll n$. Table 4.1 presents three important translation-invariant kernels and their transformations. Here regularized ERM is performed in $\hat{\mathcal{H}}$, not to avoid complexity in $n$, but to provide a direct finite representation $\tilde{w}$ of the primal solution in the case of infinite dimensional feature spaces. After performing regularized ERM in $\hat{\mathcal{H}}$, appropriate Laplace noise is added to the primal solution $\tilde{w}$ to guarantee differential privacy as before.
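A minimal sketch of the random feature construction follows. It assumes the unit-bandwidth RBF kernel $\exp(-\|x - y\|_2^2 / 2)$, whose normalized Fourier transform is the standard Gaussian density, so frequencies are drawn from $N(0, I)$; the function names, the test points, and the choice of 2000 frequencies are illustrative assumptions.

```python
import math
import random


def draw_frequencies(d_hat, dim, rng):
    """Draw d_hat frequency vectors from N(0, I) (unit-bandwidth RBF kernel)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(d_hat)]


def random_feature_map(x, freqs):
    """Map x to the 2*d_hat cosine/sine features, scaled by d_hat**-0.5."""
    scale = 1.0 / math.sqrt(len(freqs))
    feats = []
    for rho in freqs:
        angle = sum(r * xi for r, xi in zip(rho, x))
        feats.append(scale * math.cos(angle))
        feats.append(scale * math.sin(angle))
    return feats


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rbf(x, y):
    """Exact unit-bandwidth RBF kernel, for comparison."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / 2.0)


rng = random.Random(0)
freqs = draw_frequencies(2000, 2, rng)
x, y = (0.3, -0.2), (1.0, 0.5)
approx = dot(random_feature_map(x, freqs), random_feature_map(y, freqs))
exact = rbf(x, y)
self_kernel = dot(random_feature_map(x, freqs), random_feature_map(x, freqs))
print(approx, exact, self_kernel)
```

Because each cosine/sine pair has unit squared norm before scaling, the approximate kernel of a point with itself is exactly 1 up to rounding, a property the privacy analysis relies on.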
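Algorithm 12 PrivateSVM

Inputs: database $D = \{(x_i, y_i)\}_{i=1}^n$ with $x_i \in \mathbb{R}^d$, $y_i \in \{-1, 1\}$; translation-invariant kernel $k(x, y) = g(x - y)$ with Fourier transform $p(\omega) = (2\pi)^{-1} \int e^{-j\langle \omega, x \rangle} g(x)\, dx$; convex loss function $\ell$; parameters $\lambda, C > 0$ and $\hat{d} \in \mathbb{N}$.

1. $\rho_1, \ldots, \rho_{\hat{d}} \leftarrow$ Draw i.i.d. sample of $\hat{d}$ vectors in $\mathbb{R}^d$ from $p$;

2. $\tilde{\alpha} \leftarrow$ Run Algorithm 10 on $D$ with parameter $C$, kernel $\hat{k}$ induced by map (4.5), and loss $\ell$;

3. $\tilde{w} \leftarrow \sum_{i=1}^n y_i \tilde{\alpha}_i \hat{\phi}(x_i)$ where $\hat{\phi}$ is defined in Equation (4.5);

4. $\mu \leftarrow$ Draw i.i.d. sample of $2\hat{d}$ scalars from Laplace$(0, \lambda)$; and

5. Return $\hat{w} = \tilde{w} + \mu$ and $\rho_1, \ldots, \rho_{\hat{d}}$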


PrivateSVM is computationally efficient. Algorithm 12 takes $O(\hat{d})$ time to compute each entry of the kernel matrix, or a total time of $O(\hat{d} n^2)$, on top of running dual SVM in the random feature space, which is worst-case $O(n_s^3)$ for the analytic solution (where $n_s \le n$ is the number of support vectors), and faster using numerical methods such as chunking (Burges, 1998). To achieve $(\epsilon, \delta)$-usefulness wrt the hinge-loss SVM, $\hat{d}$ must be taken to be $O\left(\frac{d}{\epsilon^4}\left(\log\frac{1}{\delta} + \log\frac{1}{\epsilon}\right)\right)$ (cf. Corollary 39). By comparison it takes $O(d n^2)$ to construct the kernel matrix for any translation-invariant kernel.

As with the SVM and PrivateSVM-Finite, the response of Algorithm 12 can be used to make classifications on future test points by constructing the classifier $\hat{f}(\cdot) = f_{\hat{w}}(\cdot) = \langle \hat{w}, \hat{\phi}(\cdot) \rangle$. Unlike the previous mechanisms, however, PrivateSVM must include a parametrization of feature map $\hat{\phi}$ (the sample $\{\rho_i\}_{i=1}^{\hat{d}}$) in its response. Of PrivateSVM's total response, only $\hat{w}$ depends on database $D$. The $\rho_i$ are data-independent vectors drawn from the transform $p$ of the kernel, which we assume to be known by the adversary (to wit the adversary knows the mechanism itself, including $k$). Thus to establish differential privacy we need only consider the data-dependent weight vector; fortunately we can build on the case of PrivateSVM-Finite.


Corollary 33 (Privacy of PrivateSVM). For any $\beta > 0$, database $D$ of size $n$, $C > 0$, $\hat{d} \in \mathbb{N}$, loss function $\ell(y, \hat{y})$ that is convex and $L$-Lipschitz in $\hat{y}$, and translation-invariant kernel $k$, PrivateSVM run on $D$ with loss $\ell$, kernel $k$, noise parameter $\lambda \ge 2^{2.5} LC\sqrt{\hat{d}}/(\beta n)$, approximation parameter $\hat{d}$, and regularization parameter $C$ guarantees $\beta$-differential privacy.


Proof. The result follows immediately from Theorem 31 since $\tilde{w}$ is the primal solution of SVM with kernel $\hat{k}$, the response vector $\hat{w} = \tilde{w} + \mu$ for i.i.d. Laplace $\mu$, and $\hat{k}(x, x) = 1$ for all $x \in \mathbb{R}^d$. □
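The constant in Corollary 33 can be traced by substitution. The following worked step assumes the condition of Theorem 31 reads $\lambda \ge 4LC\kappa\sqrt{F}/(\beta n)$ and that the random feature map has $2\hat{d}$ coordinates with the $\hat{d}^{-1/2}$ normalization, so that $F = 2\hat{d}$ and $\kappa = 1$:

```latex
\lambda \;\ge\; \frac{4 L C \kappa \sqrt{F}}{\beta n}
  \;=\; \frac{4 L C \cdot 1 \cdot \sqrt{2\hat{d}}}{\beta n}
  \;=\; \frac{2^{2.5}\, L C \sqrt{\hat{d}}}{\beta n},
\qquad \text{since} \quad
\hat{k}(x, x) \;=\; \frac{1}{\hat{d}} \sum_{i=1}^{\hat{d}}
  \left( \cos^2 \langle \rho_i, x \rangle + \sin^2 \langle \rho_i, x \rangle \right) \;=\; 1 .
```

The required noise scale therefore grows with the square root of the number of random features, which is one source of the tension between approximation quality and privacy.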


This result is surprising, in that PrivateSVM is able to guarantee privacy for regularized ERM over a function class of infinite VC-dimension, where the obvious way to return the learned classifier (responding with the dual variables and feature mapping) reveals all the entries corresponding to the support vectors, completely.

Like PrivateSVM-Finite, PrivateSVM is useful with respect to the SVM. If we denote the function parametrized by intermediate weight vector $\tilde{w}$ by $\tilde{f}$, then the same argument for the utility of PrivateSVM-Finite establishes the high-probability proximity of $\tilde{f}$ and $\hat{f}$.


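Lemma 34. Consider a run of Algorithms 10 and 12 with $\hat{d} \in \mathbb{N}$, $C > 0$, convex loss and translation-invariant kernel. Denote by $\hat{f}$ and $\tilde{f}$ the classifiers parametrized by weight vectors $\hat{w}$ and $\tilde{w}$ respectively, where these vectors are related by $\hat{w} = \tilde{w} + \mu$ with $\mu \overset{iid}{\sim}$ Laplace$(0, \lambda)$ in Algorithm 12. For any $\epsilon > 0$ and $\delta \in (0, 1)$, if

$$0 < \lambda \le \min\left\{\frac{\epsilon}{2^4 \log_e 2 \sqrt{\hat{d}}},\; \frac{\epsilon\sqrt{\hat{d}}}{8 \log_e \frac{2}{\delta}}\right\}$$

then $\Pr\left(\|\hat{f} - \tilde{f}\|_\infty \le \frac{\epsilon}{2}\right) \ge 1 - \frac{\delta}{2}$.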


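Proof. As in the proof of Theorem 32 we can use the Chernoff trick to show that, for Erlang $2\hat{d}$-distributed random variable $X$, the choice of $t = (2\lambda)^{-1}$, and for any $\epsilon > 0$

$$\Pr\left(\|\hat{f} - \tilde{f}\|_\infty > \epsilon/2\right) \le \frac{E\left[e^{tX}\right]}{e^{\epsilon t \sqrt{\hat{d}}/2}} = (1 - \lambda t)^{-2\hat{d}}\, e^{-\epsilon t \sqrt{\hat{d}}/2} = 2^{2\hat{d}}\, e^{-\epsilon\sqrt{\hat{d}}/(4\lambda)} = \exp\left(\hat{d} \log_e 4 - \epsilon\sqrt{\hat{d}}/(4\lambda)\right) .$$

Provided that $\lambda \le \epsilon / \left(2^4 \log_e 2 \sqrt{\hat{d}}\right)$ this is bounded by $\exp\left(-\epsilon\sqrt{\hat{d}}/(8\lambda)\right)$. Moreover if $\lambda \le \epsilon\sqrt{\hat{d}} / \left(8 \log_e \frac{2}{\delta}\right)$ then the claim follows. □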


To show a similar result for $f^\star$ and $\tilde{f}$, we exploit smoothness of regularized ERM with respect to small changes in the RKHS itself. To the best of our knowledge, this kind of stability to the feature mapping has not been used before. We begin with a technical lemma that we will use to exploit the convexity of the regularized empirical risk functional.


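Lemma 35. Let $R$ be a functional on Hilbert space $\mathcal{H}$ satisfying $R[f] \ge R[f^\star] + \frac{a}{2}\|f - f^\star\|_{\mathcal{H}}^2$ for some $a > 0$, $f^\star \in \mathcal{H}$ and all $f \in \mathcal{H}$. Then $R[f] \le R[f^\star] + \epsilon$ implies $\|f - f^\star\|_{\mathcal{H}} \le \sqrt{2\epsilon/a}$, for all $\epsilon > 0$, $f \in \mathcal{H}$.

Proof. By assumption and the antecedent

$$\|f - f^\star\|_{\mathcal{H}}^2 \le \frac{2}{a}\left(R[f] - R[f^\star]\right) \le \frac{2}{a}\left(R[f^\star] + \epsilon - R[f^\star]\right) = 2\epsilon/a .$$

Taking square roots of both sides yields the consequent. □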
Provided that the kernel functions $k$ and $\hat{k}$ are uniformly close, the next lemma exploits insensitivity of regularized ERM to perturbations of the feature mapping to show that $f^\star$ and $\tilde{f}$ are pointwise close.


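Lemma 36. Let $\mathcal{H}$ be an RKHS with translation-invariant kernel $k$, and let $\hat{\mathcal{H}}$ be the random RKHS corresponding to feature map (4.5) induced by $k$. Let $C$ be a positive scalar and loss $\ell(y, \hat{y})$ be convex and $L$-Lipschitz continuous in $\hat{y}$. Consider the regularized empirical risk minimizers in each RKHS

$$f^\star \in \arg\min_{f \in \mathcal{H}} \; \frac{C}{n}\sum_{i=1}^n \ell\left(y_i, f(x_i)\right) + \frac{1}{2}\|f\|_{\mathcal{H}}^2 ,$$

$$g^\star \in \arg\min_{g \in \hat{\mathcal{H}}} \; \frac{C}{n}\sum_{i=1}^n \ell\left(y_i, g(x_i)\right) + \frac{1}{2}\|g\|_{\hat{\mathcal{H}}}^2 .$$

Let $\mathcal{M} \subseteq \mathbb{R}^d$ be any set containing $x_1, \ldots, x_n$. For any $\epsilon > 0$, if the dual variables from both optimizations have $L_1$-norms bounded by some $\Lambda > 0$ and

$$\left\|k - \hat{k}\right\|_{\infty;\mathcal{M}} \le \min\left\{1,\; \frac{\epsilon^2}{2^2\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right)^2}\right\}$$

then $\|f^\star - g^\star\|_{\infty;\mathcal{M}} \le \epsilon/2$.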


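Proof. Denote the empirical risk functional $R_{emp}[f] = n^{-1}\sum_{i=1}^n \ell(y_i, f(x_i))$ and the regularized empirical risk functional $R_{reg}[f] = C\,R_{emp}[f] + \|f\|^2/2$, for the appropriate RKHS norm (either $\mathcal{H}$ or $\hat{\mathcal{H}}$). Let $f^\star$ denote the regularized empirical risk minimizer in $\mathcal{H}$, given by parameter vector $\alpha^\star$, and let $g^\star$ denote the regularized empirical risk minimizer in $\hat{\mathcal{H}}$ given by parameter vector $\beta^\star$. Let $g_{\alpha^\star} = \sum_{i=1}^n \alpha_i^\star y_i \hat{\phi}(x_i)$ and $f_{\beta^\star} = \sum_{i=1}^n \beta_i^\star y_i \phi(x_i)$ denote the images of $f^\star$ and $g^\star$ under the natural mapping between the spans of the data in RKHS's $\hat{\mathcal{H}}$ and $\mathcal{H}$ respectively. We will first show that these four functions have arbitrarily close regularized empirical risk in their respective RKHS, and then that this implies uniform proximity of the functions themselves. First observe that for any $g \in \hat{\mathcal{H}}$

$$R_{reg}^{\hat{\mathcal{H}}}[g] = C\,R_{emp}[g] + \frac{1}{2}\|g\|_{\hat{\mathcal{H}}}^2$$
$$\ge C\left\langle \partial_g R_{emp}[g^\star],\, g - g^\star \right\rangle_{\hat{\mathcal{H}}} + C\,R_{emp}[g^\star] + \frac{1}{2}\|g\|_{\hat{\mathcal{H}}}^2$$
$$= \left\langle \partial_g R_{reg}^{\hat{\mathcal{H}}}[g^\star],\, g - g^\star \right\rangle_{\hat{\mathcal{H}}} - \left\langle g^\star,\, g - g^\star \right\rangle_{\hat{\mathcal{H}}} + C\,R_{emp}[g^\star] + \frac{1}{2}\|g\|_{\hat{\mathcal{H}}}^2 .$$

The inequality follows from the convexity of $R_{emp}[\cdot]$ and holds for all elements of the subdifferential $\partial_g R_{emp}[g^\star]$. The subsequent equality holds by $\partial_g R_{reg}^{\hat{\mathcal{H}}}[g] = C\,\partial_g R_{emp}[g] + g$. Now since $0 \in \partial_g R_{reg}^{\hat{\mathcal{H}}}[g^\star]$, it follows that

$$R_{reg}^{\hat{\mathcal{H}}}[g] \ge C\,R_{emp}[g^\star] + \frac{1}{2}\|g\|_{\hat{\mathcal{H}}}^2 - \left\langle g^\star,\, g - g^\star \right\rangle_{\hat{\mathcal{H}}}$$
$$= C\,R_{emp}[g^\star] + \frac{1}{2}\|g^\star\|_{\hat{\mathcal{H}}}^2 + \frac{1}{2}\|g\|_{\hat{\mathcal{H}}}^2 - \frac{1}{2}\|g^\star\|_{\hat{\mathcal{H}}}^2 - \left\langle g^\star,\, g - g^\star \right\rangle_{\hat{\mathcal{H}}}$$
$$= R_{reg}^{\hat{\mathcal{H}}}[g^\star] + \frac{1}{2}\|g\|_{\hat{\mathcal{H}}}^2 - \left\langle g^\star, g \right\rangle_{\hat{\mathcal{H}}} + \frac{1}{2}\|g^\star\|_{\hat{\mathcal{H}}}^2$$
$$= R_{reg}^{\hat{\mathcal{H}}}[g^\star] + \frac{1}{2}\|g - g^\star\|_{\hat{\mathcal{H}}}^2 .$$

With this, Lemma 35 states that for any $g \in \hat{\mathcal{H}}$ and $\epsilon' > 0$,

$$R_{reg}^{\hat{\mathcal{H}}}[g] \le R_{reg}^{\hat{\mathcal{H}}}[g^\star] + \epsilon' \quad \Rightarrow \quad \|g - g^\star\|_{\hat{\mathcal{H}}} \le \sqrt{2\epsilon'} . \quad (4.6)$$

Next we will show that the antecedent is true for $g = g_{\alpha^\star}$. Conditioned on event $A_{\epsilon'} = \left\{\|k - \hat{k}\|_{\infty;\mathcal{M}} \le \epsilon'\right\}$, for all $x \in \mathcal{M}$

$$\left|f^\star(x) - g_{\alpha^\star}(x)\right| = \left|\sum_{i=1}^n \alpha_i^\star y_i \left(k(x_i, x) - \hat{k}(x_i, x)\right)\right| \le \sum_{i=1}^n |\alpha_i^\star| \left|k(x_i, x) - \hat{k}(x_i, x)\right| \le \epsilon' \|\alpha^\star\|_1 \le \epsilon' \Lambda , \quad (4.7)$$

by the bound on $\|\alpha^\star\|_1$. This and the Lipschitz continuity of the loss leads to

$$\left|R_{reg}^{\mathcal{H}}[f^\star] - R_{reg}^{\hat{\mathcal{H}}}[g_{\alpha^\star}]\right| = \left|C\,R_{emp}[f^\star] - C\,R_{emp}[g_{\alpha^\star}] + \frac{1}{2}\|f^\star\|_{\mathcal{H}}^2 - \frac{1}{2}\|g_{\alpha^\star}\|_{\hat{\mathcal{H}}}^2\right|$$
$$\le \frac{C}{n}\sum_{i=1}^n \left|\ell\left(y_i, f^\star(x_i)\right) - \ell\left(y_i, g_{\alpha^\star}(x_i)\right)\right| + \frac{1}{2}\left|{\alpha^\star}'\left(K - \hat{K}\right)\alpha^\star\right|$$
$$\le \frac{C}{n}\sum_{i=1}^n L\,\|f^\star - g_{\alpha^\star}\|_{\infty;\mathcal{M}} + \frac{1}{2}\left|{\alpha^\star}'\left(K - \hat{K}\right)\alpha^\star\right|$$
$$\le CL\,\|f^\star - g_{\alpha^\star}\|_{\infty;\mathcal{M}} + \frac{1}{2}\|\alpha^\star\|_1 \left\|\left(K - \hat{K}\right)\alpha^\star\right\|_\infty$$
$$\le CL\,\|f^\star - g_{\alpha^\star}\|_{\infty;\mathcal{M}} + \frac{1}{2}\|\alpha^\star\|_1^2\, \epsilon'$$
$$\le CL\epsilon'\Lambda + \Lambda^2\epsilon'/2 = \left(CL + \frac{\Lambda}{2}\right)\Lambda\epsilon' .$$

Similarly, $\left|R_{reg}^{\hat{\mathcal{H}}}[g^\star] - R_{reg}^{\mathcal{H}}[f_{\beta^\star}]\right| \le (CL + \Lambda/2)\Lambda\epsilon'$ by the same argument. And since $R_{reg}^{\mathcal{H}}[f_{\beta^\star}] \ge R_{reg}^{\mathcal{H}}[f^\star]$ and $R_{reg}^{\hat{\mathcal{H}}}[g_{\alpha^\star}] \ge R_{reg}^{\hat{\mathcal{H}}}[g^\star]$ we have proved that $R_{reg}^{\hat{\mathcal{H}}}[g_{\alpha^\star}] \le R_{reg}^{\mathcal{H}}[f^\star] + (CL + \Lambda/2)\Lambda\epsilon' \le R_{reg}^{\mathcal{H}}[f_{\beta^\star}] + (CL + \Lambda/2)\Lambda\epsilon' \le R_{reg}^{\hat{\mathcal{H}}}[g^\star] + 2(CL + \Lambda/2)\Lambda\epsilon'$. And by implication (4.6),

$$\|g_{\alpha^\star} - g^\star\|_{\hat{\mathcal{H}}} \le 2\sqrt{\left(CL + \frac{\Lambda}{2}\right)\Lambda\epsilon'} . \quad (4.8)$$

Now $\hat{k}(x, x) = 1$ for each $x \in \mathbb{R}^d$ implies

$$\left|g_{\alpha^\star}(x) - g^\star(x)\right| = \left|\left\langle g_{\alpha^\star} - g^\star,\, \hat{k}(x, \cdot) \right\rangle_{\hat{\mathcal{H}}}\right| \le \|g_{\alpha^\star} - g^\star\|_{\hat{\mathcal{H}}} \sqrt{\hat{k}(x, x)} = \|g_{\alpha^\star} - g^\star\|_{\hat{\mathcal{H}}} .$$

This combines with Inequality (4.8) to yield

$$\|g_{\alpha^\star} - g^\star\|_{\infty;\mathcal{M}} \le 2\sqrt{\left(CL + \frac{\Lambda}{2}\right)\Lambda\epsilon'} .$$

Together with Inequality (4.7) this finally implies that $\|f^\star - g^\star\|_{\infty;\mathcal{M}} \le \epsilon'\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda\epsilon'}$, conditioned on event $A_{\epsilon'} = \left\{\|k - \hat{k}\|_{\infty;\mathcal{M}} \le \epsilon'\right\}$. For desired accuracy $\epsilon > 0$, conditioning on event $A_{\epsilon'}$ with

$$\epsilon' = \min\left\{\frac{\epsilon}{2\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right)},\; \frac{\epsilon^2}{\left[2\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right)\right]^2}\right\}$$

yields bound $\|f^\star - g^\star\|_{\infty;\mathcal{M}} \le \epsilon/2$: if $\epsilon' \le 1$ then $\epsilon/2 \ge \sqrt{\epsilon'}\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right) \ge \epsilon'\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda\epsilon'}$ provided that $\epsilon' \le \epsilon^2 / \left[2\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right)\right]^2$. Otherwise if $\epsilon' > 1$ then we have $\epsilon/2 \ge \epsilon'\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right) \ge \epsilon'\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda\epsilon'}$ provided $\epsilon' \le \epsilon / \left[2\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right)\right]$. Since for any $H > 0$, $\min\{H, H^2\} \ge \min\{1, H^2\}$, the result follows. □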


We now recall the result due to Rahimi and Recht (2008) that establishes the non-asymptotic uniform convergence of the kernel functions required by the previous Lemma (i.e., an upper bound on the probability of event $A_{\epsilon'}$).

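Lemma 37 (Rahimi and Recht 2008, Claim 1). For any $\epsilon > 0$, $\delta \in (0, 1)$, translation-invariant kernel $k$ and compact set $\mathcal{M} \subset \mathbb{R}^d$, if

$$\hat{d} \ge \frac{4(d + 2)}{\epsilon^2} \log_e\left(\frac{2^8 \left(\sigma_p\, \mathrm{diam}(\mathcal{M})\right)^2}{\delta \epsilon^2}\right) ,$$

then Algorithm 12's random feature mapping $\hat{\phi}$ defined in Equation (4.5) satisfies $\Pr\left(\|\hat{k} - k\|_{\infty;\mathcal{M}} < \epsilon\right) \ge 1 - \delta$, where $\sigma_p^2 = E[\langle \omega, \omega \rangle]$ is the second moment of the Fourier transform $p$ of $k$'s $g$ function.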
Combining these ingredients establishes utility for PrivateSVM.


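Theorem 38 (Utility of PrivateSVM). Consider any database $D$, compact set $\mathcal{M} \subset \mathbb{R}^d$ containing $D$, convex loss $\ell$, translation-invariant kernel $k$, and scalars $C, \epsilon > 0$ and $\delta \in (0, 1)$. Suppose the SVM with loss $\ell$, kernel $k$ and parameter $C$ has dual variables with $L_1$-norm bounded by $\Lambda$. Then Algorithm 12 run on $D$ with loss $\ell$, kernel $k$, parameters

$$\hat{d} \ge \frac{4(d + 2)}{\theta(\epsilon)} \log_e\left(\frac{2^9 \left(\sigma_p\, \mathrm{diam}(\mathcal{M})\right)^2}{\delta\, \theta(\epsilon)}\right) \quad \text{where} \quad \theta(\epsilon) = \min\left\{1,\; \frac{\epsilon^4}{2^4\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right)^4}\right\} ,$$

$$\lambda \le \min\left\{\frac{\epsilon}{2^4 \log_e 2 \sqrt{\hat{d}}},\; \frac{\epsilon\sqrt{\hat{d}}}{8 \log_e \frac{2}{\delta}}\right\} ,$$

and $C$ is $(\epsilon, \delta)$-useful with respect to Algorithm 10 run on $D$ with loss $\ell$, kernel $k$ and parameter $C$, wrt the $\|\cdot\|_{\infty;\mathcal{M}}$-norm.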


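Proof. Lemmas 36 and 34 combined via the triangle inequality, with Lemma 37, together establish the result as follows. Define $A$ to be the conditioning event regarding the approximation of $k$ by $\hat{k}$, denote the events in Lemmas 36 and 34 by $B$ and $C$ (beware we are overloading $C$ with the regularization parameter; its meaning will be apparent from the context), and the target event in the theorem by $D$.

$$A = \left\{\left\|k - \hat{k}\right\|_{\infty;\mathcal{M}} < \min\left\{1,\; \frac{\epsilon^2}{2^2\left(\Lambda + 2\sqrt{\left(CL + \frac{\Lambda}{2}\right)\Lambda}\right)^2}\right\}\right\}$$
$$B = \left\{\left\|f^\star - \tilde{f}\right\|_{\infty;\mathcal{M}} \le \epsilon/2\right\}$$
$$C = \left\{\left\|\hat{f} - \tilde{f}\right\|_{\infty} \le \epsilon/2\right\}$$
$$D = \left\{\left\|f^\star - \hat{f}\right\|_{\infty;\mathcal{M}} \le \epsilon\right\}$$

The claim is a bound on $\Pr(D)$. By the triangle inequality events $B$ and $C$ together imply $D$. Second note that event $C$ is independent of $A$ and $B$. Thus $\Pr(D \mid A) \ge \Pr(B \cap C \mid A) = \Pr(B \mid A) \Pr(C) \ge 1 \cdot (1 - \delta/2)$, for sufficiently small $\lambda$. Finally Lemma 37 bounds $\Pr(A)$ as follows: provided that $\hat{d} \ge 4(d + 2) \log_e\left(2^9 \left(\sigma_p\, \mathrm{diam}(\mathcal{M})\right)^2 / (\delta\, \theta(\epsilon))\right) / \theta(\epsilon)$ where $\theta(\epsilon) = \min\left\{1,\; \epsilon^4 / \left[2\left(\Lambda + 2\sqrt{(CL + \Lambda/2)\Lambda}\right)\right]^4\right\}$ we have $\Pr(A) \ge 1 - \delta/2$. Together this yields $\Pr(D) \ge \Pr(D \mid A) \Pr(A) \ge (1 - \delta/2)^2 \ge 1 - \delta$. □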


Again we see that utility and privacy place competing constraints on the level of noise $\lambda$. Next we will use these interactions to upper-bound the optimal differential privacy of the SVM.
4.5  Hinge-Loss  and  an  Upper  Bound  on  Optimal  Differential  Privacy

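We begin by plugging hinge-loss $\ell(y, \hat{y}) = (1 - y\hat{y})_+$ into the main results on privacy and utility of the previous section (similar computations can be done for PrivateSVM-Finite and other convex loss functions). The following is the dual formulation of hinge-loss SVM learning:

$$\max_{\alpha \in \mathbb{R}^n} \; \sum_{i=1}^n \alpha_i - \frac{1}{2}\sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j y_i y_j k(x_i, x_j) \qquad \text{s.t.} \quad 0 \le \alpha_i \le \frac{C}{n} \quad \forall i \in [n] . \quad (4.9)$$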

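Corollary 39. Consider any database $D$ of size $n$, scalar $C > 0$, and translation-invariant kernel $k$.

i. For any $\beta > 0$ and $\hat{d} \in \mathbb{N}$, PrivateSVM run on $D$ with hinge-loss, noise parameter $\lambda \ge \frac{2^{2.5} C \sqrt{\hat{d}}}{\beta n}$, approximation parameter $\hat{d}$, and regularization parameter $C$, guarantees $\beta$-differential privacy.

ii. Moreover for any compact set $\mathcal{M} \subset \mathbb{R}^d$ containing $D$, and scalars $\epsilon > 0$ and $\delta \in (0, 1)$, PrivateSVM run on $D$ with hinge-loss, kernel $k$, noise parameter $\lambda \le \min\left\{\frac{\epsilon}{2^4 \log_e 2 \sqrt{\hat{d}}},\; \frac{\epsilon\sqrt{\hat{d}}}{8 \log_e \frac{2}{\delta}}\right\}$, approximation parameter $\hat{d} \ge \frac{4(d + 2)}{\theta(\epsilon)} \log_e\left(\frac{2^9 \left(\sigma_p\, \mathrm{diam}(\mathcal{M})\right)^2}{\delta\, \theta(\epsilon)}\right)$ with $\theta(\epsilon) = \min\left\{1,\; \frac{\epsilon^4}{2^{12} C^4}\right\}$, and regularization parameter $C$, is $(\epsilon, \delta)$-useful wrt hinge-loss SVM run on $D$ with kernel $k$, and parameter $C$.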

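Proof. The first result follows from the privacy theorem of the previous section and the fact that hinge-loss is convex and 1-Lipschitz on $\mathbb{R}$: i.e., $\partial_{\hat{y}} \ell = \mathbf{1}[1 \geq y\hat{y}] \leq 1$. The second result follows almost immediately from the utility theorem of the previous section. For hinge-loss we have that feasible $\alpha_i$'s are bounded by $C/n$ (and so $\Lambda = C$) by the dual's box constraints and that $L = 1$, implying we take

\[
\theta(\epsilon) = \min\left\{ 1, \frac{\epsilon^4}{2^4 C^4 \left(1 + \sqrt{6}\right)^4} \right\} .
\]

This is bounded by the stated $\theta(\epsilon)$. □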


Combining the competing requirements on noise level $\lambda$ upper-bounds optimal differential privacy of hinge-loss SVM.

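Theorem 40. The optimal differential privacy for hinge-loss SVM learning on translation-invariant kernel $k$ is bounded by $\beta(\epsilon, \delta, C, n, \ell, k) = O\left( \frac{1}{\epsilon^3 n} \sqrt{\log \frac{1}{\delta\epsilon}} \left( \log \frac{1}{\epsilon} + \log\log \frac{1}{\delta\epsilon} \right) \right)$.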

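Proof. Consider hinge-loss in Corollary 39. Privacy places a lower bound of $\beta \geq 2^{2.5} C \sqrt{\hat{d}} / (\lambda n)$ for any chosen $\lambda$, which we can convert to a lower bound on $\beta$ in terms of $\epsilon$ and $\delta$ as follows. For small $\epsilon$, we have $\theta(\epsilon) = O(\epsilon^4)$ and so to achieve $(\epsilon, \delta)$-usefulness we must take $\hat{d} = O\left( \frac{1}{\epsilon^4} \log_e\left( \frac{1}{\delta\epsilon^4} \right) \right)$. There are two cases for utility. If $\lambda = \epsilon / \left( 2^4 \log_e\left( 2\sqrt{\hat{d}} \right) \right)$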
Figure 4.1: For each $i \in [2]$, the SVM's primal solution $w_i^*$ on database $D_i$ constructed in the proof of Lemma 41, corresponds to the crossing point of line $y = w$ with $y = w - \partial_w f_i(w)$. Database $D_1$ is shown on the left, database $D_2$ is shown on the right.


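then $\beta = O\left( \frac{\sqrt{\hat{d}} \log \sqrt{\hat{d}}}{\epsilon n} \right) = O\left( \frac{1}{\epsilon^3 n} \sqrt{\log \frac{1}{\delta\epsilon}} \left( \log \frac{1}{\epsilon} + \log\log \frac{1}{\delta\epsilon} \right) \right)$. Otherwise we are in the second case, with $\lambda = \frac{\epsilon \sqrt{\hat{d}}}{8 \log_e (2/\delta)}$, yielding $\beta = O\left( \frac{1}{\epsilon n} \log \frac{1}{\delta} \right)$ which is dominated by the first case as $\epsilon \downarrow 0$. □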


A natural question arises from this discussion: given any mechanism that is $(\epsilon, \delta)$-useful with respect to hinge SVM, for how small a $\beta$ can we possibly hope to guarantee $\beta$-differential privacy? In other words, what lower bounds exist for the optimal differential privacy for the SVM?


4.6  Lower  Bounding  Optimal  Differential  Privacy 

We now present lower bounds on the level $\beta$ of differential privacy achievable for any $(\epsilon, \delta)$-useful mechanism with respect to the hinge-loss SVM. We consider both mechanisms for linear kernels and mechanisms for RBF kernels.

4.6.1  Lower  Bound  for  Linear  Kernels 

In this section we present a lower bound on the level of differential privacy for any mechanism approximating hinge-loss linear SVM with high accuracy. The first lemma corresponds to a kind of negative sensitivity result: for a particular pair of neighboring databases we show that the SVM is sensitive.

Lemma 41. For any $C > 0$, $n > 1$ and $0 < \epsilon < \frac{\sqrt{C}}{2n}$, there exists a pair of neighboring databases $D_1, D_2$ on $n$ entries, such that the functions $f_1^*, f_2^*$ parametrized by SVM run with parameter $C$, linear kernel, and hinge-loss on $D_1, D_2$ respectively, satisfy $\|f_1^* - f_2^*\|_\infty > 2\epsilon$.
Proof. We construct the two databases on the line as follows. Let $0 < m < M$ be scalars to be chosen later. Both databases share negative examples $x_1 = \ldots = x_{\lfloor n/2 \rfloor} = -M$ and positive examples $x_{\lfloor n/2 \rfloor + 1} = \ldots = x_{n-1} = M$. Each database has $x_n = M - m$, with $y_n = -1$ for $D_1$ and $y_n = 1$ for $D_2$. In what follows we use subscripts to denote an example's parent database, so $(x_{i,j}, y_{i,j})$ is the $j$th example from $D_i$. Consider the result of running primal SVM on each database

\[
w_1^* = \arg\min_{w \in \mathbb{R}} \; \frac{1}{2} w^2 + \frac{C}{n} \sum_{i=1}^n (1 - y_{1,i} w x_{1,i})_+ ,
\]
\[
w_2^* = \arg\min_{w \in \mathbb{R}} \; \frac{1}{2} w^2 + \frac{C}{n} \sum_{i=1}^n (1 - y_{2,i} w x_{2,i})_+ .
\]

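Each optimization is strictly convex and unconstrained, so the optimizing $w_1^*, w_2^*$ are characterized by the first-order KKT conditions $0 \in \partial_w f_i(w)$ for $f_i$ being the objective function for learning on $D_i$, and $\partial_w$ denoting the subdifferential operator. Now for each $i \in [2]$

\[
\partial_w f_i(w) = w - \frac{C}{n} \sum_{j=1}^n y_{i,j} x_{i,j} \tilde{\mathbf{1}}[1 - y_{i,j} w x_{i,j}] ,
\]

where

\[
\tilde{\mathbf{1}}[x] = \begin{cases} \{0\} , & \text{if } x < 0 \\ [0, 1] , & \text{if } x = 0 \\ \{1\} , & \text{if } x > 0 \end{cases}
\]

is the subdifferential of $(x)_+$. Thus for each $i \in [2]$, $w_i^* \in \frac{C}{n} \sum_{j=1}^n y_{i,j} x_{i,j} \tilde{\mathbf{1}}[1 - y_{i,j} w_i^* x_{i,j}]$, which is equivalent to

\[
w_1^* \in \frac{C M (n-1)}{n} \tilde{\mathbf{1}}\left[ \frac{1}{M} - w_1^* \right] + \frac{C (m - M)}{n} \tilde{\mathbf{1}}\left[ w_1^* - \frac{1}{m - M} \right] ,
\]
\[
w_2^* \in \frac{C M (n-1)}{n} \tilde{\mathbf{1}}\left[ \frac{1}{M} - w_2^* \right] + \frac{C (M - m)}{n} \tilde{\mathbf{1}}\left[ \frac{1}{M - m} - w_2^* \right] .
\]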
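The RHSs of these conditions correspond to decreasing piecewise-constant functions, and the conditions are met when the corresponding functions intersect with the diagonal $y = x$ line, as shown in Figure 4.1. If $\frac{C(M(n-2)+m)}{n} < \frac{1}{M}$ then $w_1^* = \frac{C(M(n-2)+m)}{n}$. And if $\frac{C(Mn-m)}{n} < \frac{1}{M}$ then $w_2^* = \frac{C(Mn-m)}{n}$. So provided that $\frac{1}{M} > \max\left\{ \frac{C(M(n-2)+m)}{n}, \frac{C(Mn-m)}{n} \right\} = \frac{C(Mn-m)}{n}$, we have $|w_1^* - w_2^*| = \frac{2C}{n} |M - m|$. So taking $M = \frac{2n\epsilon}{C}$ and $m = \frac{n\epsilon}{C}$ this implies

\[
\|f_1^* - f_2^*\|_\infty \geq |f_1^*(1) - f_2^*(1)| = |w_1^* - w_2^*| = 2\epsilon ,
\]

provided $\epsilon < \frac{\sqrt{C}}{2n}$. □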
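The construction in this proof can be checked numerically. The sketch below builds the two one-dimensional databases and minimizes each primal objective by ternary search, which is valid because the objective is strictly convex. The choices $M = 2n\epsilon/C$ and $m = n\epsilon/C$ are assumptions here, reconstructed so that the two solutions differ by exactly $2\epsilon$; the helper names and the values of $n$, $C$, $\epsilon$ are illustrative.

```python
def svm_objective(w, xs, ys, C):
    """Primal objective 0.5 * w^2 + (C/n) * sum of hinge losses, linear kernel on R."""
    n = len(xs)
    hinge = sum(max(0.0, 1.0 - y * w * x) for x, y in zip(xs, ys))
    return 0.5 * w * w + (C / n) * hinge


def minimize_convex_1d(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a strictly convex function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def neighboring_databases(n, C, eps):
    """Two databases sharing n - 1 examples and differing only in the last label."""
    M = 2.0 * n * eps / C  # assumed choice of M
    m = n * eps / C        # assumed choice of m
    half = n // 2
    xs = [-M] * half + [M] * (n - 1 - half) + [M - m]
    shared_labels = [-1] * half + [1] * (n - 1 - half)
    return (xs, shared_labels + [-1]), (xs, shared_labels + [1])


n, C, eps = 10, 1.0, 0.01  # eps < sqrt(C) / (2 n) = 0.05
(x1, y1), (x2, y2) = neighboring_databases(n, C, eps)
w1 = minimize_convex_1d(lambda w: svm_objective(w, x1, y1, C), -10.0, 10.0)
w2 = minimize_convex_1d(lambda w: svm_objective(w, x2, y2, C), -10.0, 10.0)
gap = abs(w1 - w2)
```

Since $f_i^*(x) = w_i^* x$ for the linear kernel, `gap` is the difference of the two classifiers evaluated at $x = 1$.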
Next using a probabilistic method argument, we show that the negative sensitivity result leads to a lower bound.

Theorem 42 (Lower bound on optimal differential privacy for hinge-loss SVM). For any $C > 0$, $n > 1$, $\delta \in (0, 1)$ and $\epsilon \in \left(0, \frac{\sqrt{C}}{2n}\right)$, the optimal differential privacy for the hinge-loss SVM with linear kernel is lower-bounded by $\log_e \frac{1 - \delta}{\delta}$. In other words, for any $C, \beta > 0$ and $n > 1$, if a mechanism $\hat{M}$ is $(\epsilon, \delta)$-useful and $\beta$-differentially private then either $\epsilon \geq \frac{\sqrt{C}}{2n}$ or $\delta \geq \exp(-\beta)$.

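Proof. Consider $(\epsilon, \delta)$-useful mechanism $\hat{M}$ with respect to SVM learning mechanism $M$ with parameter $C > 0$, hinge-loss and linear kernel on $n$ training examples, where $\delta > 0$ and $\frac{\sqrt{C}}{2n} > \epsilon > 0$. By Lemma 41 there exists a pair of neighboring databases $D_1, D_2$ on $n$ entries, such that $\|f_1^* - f_2^*\|_\infty > 2\epsilon$ where $f_i^* = f_{M(D_i)}$ for each $i \in [2]$. Let $\hat{f}_i = f_{\hat{M}(D_i)}$ for each $i \in [2]$. Then by the utility of $\hat{M}$,

\[
\Pr\left( \hat{f}_1 \in B_\epsilon^\infty(f_1^*) \right) \geq 1 - \delta , \tag{4.10}
\]
\[
\Pr\left( \hat{f}_2 \in B_\epsilon^\infty(f_1^*) \right) \leq \Pr\left( \hat{f}_2 \notin B_\epsilon^\infty(f_2^*) \right) < \delta . \tag{4.11}
\]

Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be the distributions of $\hat{M}(D_1)$ and $\hat{M}(D_2)$ respectively so that $\mathcal{P}_i(t) = \Pr\left( \hat{M}(D_i) = t \right)$. Then by Inequalities (4.10) and (4.11)

\[
\mathbb{E}_{T \sim \mathcal{P}_1}\left[ \frac{d\mathcal{P}_2(T)}{d\mathcal{P}_1(T)} \;\middle|\; T \in B_\epsilon^\infty(f_1^*) \right]
= \frac{\int_{B_\epsilon^\infty(f_1^*)} \frac{d\mathcal{P}_2(t)}{d\mathcal{P}_1(t)} \, d\mathcal{P}_1(t)}{\int_{B_\epsilon^\infty(f_1^*)} d\mathcal{P}_1(t)}
\leq \frac{\delta}{1 - \delta} .
\]

Thus there exists a $t$ such that $\log \frac{\Pr(\hat{M}(D_1) = t)}{\Pr(\hat{M}(D_2) = t)} \geq \log \frac{1 - \delta}{\delta}$. □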

4.6.2  Lower  Bound  for  RBF  Kernels 

To lower bound the level $\beta$ of differential privacy achievable for any $(\epsilon, \delta)$-useful mechanism with respect to an RBF hinge-loss SVM, we again begin with a negative sensitivity result for the SVM. But now we can exploit the RBF kernel to construct a sequence of $N$ pairwise neighboring databases whose images under SVM learning form an $\epsilon$-packing. By using the RBF kernel with shrinking variance parameter, we can achieve this for any $N$.


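Lemma 43. For any $C > 0$, $n > C$, $0 < \epsilon < \frac{C}{4n}$, and $0 < \sigma < \sqrt{\frac{1}{2 \log_e 2}}$, there exists a set of $N = \left\lfloor \frac{2}{\sigma} \sqrt{\frac{2}{\log_e 2}} \right\rfloor$ pairwise-neighboring databases $\{D_i\}_{i=1}^N$ on $n$ examples, such that the functions $f_i^*$ parametrized by hinge-loss SVM run on $D_i$ with parameter $C$ and RBF kernel with parameter $\sigma$, satisfy $\|f_i^* - f_j^*\|_\infty > 2\epsilon$ for each $i \neq j$.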




Proof. Construct $N > 1$ pairwise neighboring databases each on $n$ examples in $\mathbb{R}^2$ as follows. Each database $i$ has $n - 1$ negative examples $x_{i,1} = \ldots = x_{i,n-1} = 0$, and database $D_i$ has positive example $x_{i,n} = (\cos\theta_i, \sin\theta_i)$ where $\theta_i = \frac{2\pi i}{N}$. Consider the result of running SVM with hinge-loss and RBF kernel on each $D_i$. For each database $k(x_{i,s}, x_{i,t}) = 1$ and $k(x_{i,s}, x_{i,n}) = \exp\left( -\frac{1}{2\sigma^2} \right) =: \gamma$ for all $s, t \in [n - 1]$. Notice that the range space of $\gamma$ is $(0, 1)$. Since the inner-products and labels are database-independent, the SVM dual variables are also database-independent. Each involves solving


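\[
\max_{\alpha \in \mathbb{R}^n} \; \alpha' \mathbf{1} - \frac{1}{2} \alpha' \begin{pmatrix} \mathbf{1}\mathbf{1}' & -\gamma \mathbf{1} \\ -\gamma \mathbf{1}' & 1 \end{pmatrix} \alpha
\qquad \text{s.t.} \quad 0 \leq \alpha \leq \frac{C}{n} \mathbf{1} .
\]

By symmetry $\alpha_1^* = \ldots = \alpha_{n-1}^*$, so we can reduce this to the equivalent program on two variables:

\[
\max_{\alpha \in \mathbb{R}^2} \; \alpha' \begin{pmatrix} n - 1 \\ 1 \end{pmatrix} - \frac{1}{2} \alpha' \begin{pmatrix} (n-1)^2 & -\gamma(n-1) \\ -\gamma(n-1) & 1 \end{pmatrix} \alpha
\qquad \text{s.t.} \quad 0 \leq \alpha \leq \frac{C}{n} \mathbf{1} .
\]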

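Consider first the unconstrained program. In this case the necessary first-order KKT condition is that

\[
\begin{pmatrix} (n-1)^2 & -\gamma(n-1) \\ -\gamma(n-1) & 1 \end{pmatrix} \alpha^* = \begin{pmatrix} n - 1 \\ 1 \end{pmatrix} .
\]

This implies

\[
\begin{aligned}
\alpha^* &= \begin{pmatrix} (n-1)^2 & -\gamma(n-1) \\ -\gamma(n-1) & 1 \end{pmatrix}^{-1} \begin{pmatrix} n - 1 \\ 1 \end{pmatrix} \\
&= \frac{1}{(n-1)^2 (1 - \gamma^2)} \begin{pmatrix} 1 & \gamma(n-1) \\ \gamma(n-1) & (n-1)^2 \end{pmatrix} \begin{pmatrix} n - 1 \\ 1 \end{pmatrix} \\
&= \frac{1}{(n-1)^2 (1 - \gamma)(1 + \gamma)} \begin{pmatrix} (n-1)(1 + \gamma) \\ (n-1)^2 (1 + \gamma) \end{pmatrix} \\
&= \begin{pmatrix} \frac{1}{(n-1)(1 - \gamma)} \\ \frac{1}{1 - \gamma} \end{pmatrix} .
\end{aligned}
\]

Since this solution is strictly positive, it follows that at most two (upper) constraints can be active. Thus four cases are possible: the solution lies in the interior of the feasible set, or one or both upper box-constraints hold with equality. Noting that $\alpha_1^* \leq \alpha_2^*$ it follows that $\alpha^*$ is feasible iff $\frac{1}{1 - \gamma} \leq \frac{C}{n}$. This is equivalent to $C \geq \frac{n}{1 - \gamma} > n$, since $\gamma \in (0, 1)$. This corresponds to under-regularization.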
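As a sanity check on the algebra, the following sketch solves the $2 \times 2$ first-order condition by Cramer's rule and compares it with the closed form $\alpha^* = \left( \frac{1}{(n-1)(1-\gamma)}, \frac{1}{1-\gamma} \right)$. The function names and the test values of $n$ and $\gamma$ are illustrative assumptions.

```python
def unconstrained_dual_solution(n, gamma):
    """Solve Q a = b for the reduced two-variable dual using Cramer's rule,
    with Q = [[(n-1)^2, -gamma (n-1)], [-gamma (n-1), 1]] and b = (n-1, 1)."""
    q11 = float((n - 1) ** 2)
    q12 = -gamma * (n - 1)
    q22 = 1.0
    b1, b2 = float(n - 1), 1.0
    det = q11 * q22 - q12 * q12
    a1 = (b1 * q22 - q12 * b2) / det
    a2 = (q11 * b2 - q12 * b1) / det
    return a1, a2


def closed_form_solution(n, gamma):
    """Closed form of the unconstrained optimum derived in the text."""
    return 1.0 / ((n - 1) * (1.0 - gamma)), 1.0 / (1.0 - gamma)


def interior_case_feasible(n, gamma, C):
    """The unconstrained optimum is feasible iff its larger entry is at most C / n."""
    return 1.0 / (1.0 - gamma) <= C / n


numeric = unconstrained_dual_solution(5, 0.3)
exact = closed_form_solution(5, 0.3)
```

With $n > C$ the interior case is never feasible, which is the regime used later in the proof.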
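If both constraints hold with equality we have $\alpha^* = \frac{C}{n} \mathbf{1}$, which is always feasible.

In the case where the first constraint holds with equality $\alpha_1^* = \frac{C}{n}$, the second dual variable is found by optimizing

\[
\begin{aligned}
\alpha_2^* &= \arg\max_{\alpha_2 \in \mathbb{R}} \; \alpha' \begin{pmatrix} n - 1 \\ 1 \end{pmatrix} - \frac{1}{2} \alpha' \begin{pmatrix} (n-1)^2 & -\gamma(n-1) \\ -\gamma(n-1) & 1 \end{pmatrix} \alpha \\
&= \arg\max_{\alpha_2 \in \mathbb{R}} \; \frac{C(n-1)}{n} + \alpha_2 - \frac{1}{2} \left( \left( \frac{C(n-1)}{n} \right)^2 - 2 \frac{C\gamma(n-1)}{n} \alpha_2 + \alpha_2^2 \right) \\
&= \arg\max_{\alpha_2 \in \mathbb{R}} \; -\frac{1}{2} \alpha_2^2 + \alpha_2 \left( 1 + \frac{C\gamma(n-1)}{n} \right) ,
\end{aligned}
\]

where $\alpha = (C/n, \alpha_2)'$, implying $\alpha_2^* = 1 + \frac{C\gamma(n-1)}{n}$. This solution is feasible provided $1 + \frac{C\gamma(n-1)}{n} \leq \frac{C}{n}$ iff $n \leq C(1 - \gamma(n-1))$. Again this corresponds to under-regularization.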

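Finally in the case where the second constraint holds with equality $\alpha_2^* = \frac{C}{n}$, the first dual is found by optimizing

\[
\begin{aligned}
\alpha_1^* &= \arg\max_{\alpha_1 \in \mathbb{R}} \; \alpha' \begin{pmatrix} n - 1 \\ 1 \end{pmatrix} - \frac{1}{2} \alpha' \begin{pmatrix} (n-1)^2 & -\gamma(n-1) \\ -\gamma(n-1) & 1 \end{pmatrix} \alpha \\
&= \arg\max_{\alpha_1 \in \mathbb{R}} \; (n-1)\alpha_1 + \frac{C}{n} - \frac{1}{2} \left( (n-1)^2 \alpha_1^2 - 2 \frac{C\gamma(n-1)}{n} \alpha_1 + \frac{C^2}{n^2} \right) \\
&= \arg\max_{\alpha_1 \in \mathbb{R}} \; -\frac{1}{2} (n-1)^2 \alpha_1^2 + (n-1) \alpha_1 \left( 1 + \frac{C\gamma}{n} \right) ,
\end{aligned}
\]

where $\alpha = (\alpha_1, C/n)'$, implying $\alpha_1^* = \frac{1 + \frac{C\gamma}{n}}{n - 1}$. This is feasible provided $\frac{1 + \frac{C\gamma}{n}}{n - 1} \leq \frac{C}{n}$. Passing back to the program on $n$ variables, by the invariance of the duals to the database, for any pair $D_i, D_j$

\[
\left| f_i^*(x_{i,n}) - f_j^*(x_{i,n}) \right| = \alpha_n^* \left( 1 - k(x_{i,n}, x_{j,n}) \right)
\geq \alpha_n^* \left( 1 - \max_{q \neq i} k(x_{i,n}, x_{q,n}) \right) .
\]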

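Now a simple argument shows that this maximum is equal to $\gamma^{4 \sin^2(\pi/N)}$ for all $i$. The maximum objective is optimized when $|q - i| = 1$. In this case $|\theta_i - \theta_q| = \frac{2\pi}{N}$. The norm $\|x_{i,n} - x_{q,n}\| = 2 \sin \frac{|\theta_i - \theta_q|}{2} = 2 \sin \frac{\pi}{N}$ by basic geometry. Thus $k(x_{i,n}, x_{q,n}) = \exp\left( -\frac{\|x_{i,n} - x_{q,n}\|^2}{2\sigma^2} \right) = \exp\left( -\frac{2}{\sigma^2} \sin^2 \frac{\pi}{N} \right) = \gamma^{4 \sin^2(\pi/N)}$ as claimed. Notice that $N \geq 2$ so the exponent $4 \sin^2 \frac{\pi}{N}$ is in $(0, 4]$, while the base $\gamma$ is in $(0, 1)$. In summary we have shown that for any $i \neq j$

\[
\left| f_i(x_{i,n}) - f_j(x_{i,n}) \right| \geq \left( 1 - \exp\left( -\frac{2}{\sigma^2} \sin^2 \frac{\pi}{N} \right) \right) \alpha_n^* .
\]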

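Assume $\gamma < \frac{1}{2}$. If $n > C$ then $n > C > (1 - \gamma) C$, which implies case 1 is infeasible. Similarly since $C\gamma \frac{n-1}{n} > 0$, $n > C$ implies $1 + C\gamma \frac{n-1}{n} > 1 > \frac{C}{n}$, which implies case 3 is infeasible. Thus provided that $\gamma < \frac{1}{2}$ and $n > C$ we have that either case 2 or case 4 must hold. In both cases $\alpha_n^* = \frac{C}{n}$ giving

\[
\left| f_i(x_{i,n}) - f_j(x_{i,n}) \right| \geq \left( 1 - \exp\left( -\frac{2}{\sigma^2} \sin^2 \frac{\pi}{N} \right) \right) \frac{C}{n} .
\]

Provided that $\sigma \leq \sqrt{\frac{2}{\log 2}} \sin \frac{\pi}{N}$ we have $\left( 1 - \exp\left( -\frac{2}{\sigma^2} \sin^2 \frac{\pi}{N} \right) \right) \frac{C}{n} \geq \left( 1 - \frac{1}{2} \right) \frac{C}{n} = \frac{C}{2n}$. Now for small $x$ we can take the linear approximation $\sin x \geq \frac{2}{\pi} x$ for $x \in [0, \pi/2]$. If $N \geq 2$ then $\sin \frac{\pi}{N} \geq \frac{2}{N}$. Thus in this case we can take $\sigma \leq \sqrt{\frac{2}{\log 2}} \frac{2}{N}$ to imply $|f_i(x_{i,n}) - f_j(x_{i,n})| \geq \frac{C}{2n}$. This bound on $\sigma$ in turn implies the following bound on $\gamma$: $\gamma = \exp\left( -\frac{1}{2\sigma^2} \right) \leq \exp\left( -\frac{N^2 \log 2}{16} \right)$. Thus taking $N \geq 4$, in conjunction with $\sigma \leq \sqrt{\frac{2}{\log 2}} \frac{2}{N}$, implies $\gamma \leq \frac{1}{2}$.

Rather than selecting $N$ which bounds $\sigma$, we can choose $N$ in terms of $\sigma$. $\sigma \leq \sqrt{\frac{2}{\log 2}} \frac{2}{N}$ is implied by $N = \left\lfloor \frac{2}{\sigma} \sqrt{\frac{2}{\log 2}} \right\rfloor$. So for small $\sigma$ we can construct more databases leading to the desired separation. Finally, $N \geq 4$ implies that we must constrain $\sigma \leq \sqrt{\frac{1}{2 \log 2}}$.

In summary, if $n > C$ and $\sigma \leq \sqrt{\frac{1}{2 \log 2}}$ then $|f_i(x_{i,n}) - f_j(x_{i,n})| \geq \frac{C}{2n}$ for each $i \neq j \in [N]$ where $N = \left\lfloor \frac{2}{\sigma} \sqrt{\frac{2}{\log 2}} \right\rfloor$. Moreover if $\epsilon \leq \frac{C}{4n}$ then for any $i \neq j$ this implies $\|f_i - f_j\|_\infty \geq 2\epsilon$ as claimed. □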
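The packing construction can be illustrated numerically. The sketch below places the $N$ distinguishing positive examples on the unit circle, computes the largest RBF kernel value between any two of them, and evaluates the resulting separation $(1 - \max k) \, C/n$. The formula used for $N$ and the threshold $C/(2n)$ are assumptions reconstructed from the argument above; function names and parameter values are illustrative.

```python
import math


def num_databases(sigma):
    """Assumed packing size N = floor((2 / sigma) * sqrt(2 / log 2))."""
    return math.floor((2.0 / sigma) * math.sqrt(2.0 / math.log(2.0)))


def circle_points(N):
    """Positive examples x_{i,n} = (cos theta_i, sin theta_i) with theta_i = 2 pi i / N."""
    return [(math.cos(2.0 * math.pi * i / N), math.sin(2.0 * math.pi * i / N))
            for i in range(1, N + 1)]


def rbf(x, z, sigma):
    """Gaussian RBF kernel exp(-||x - z||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq / (2.0 * sigma ** 2))


def min_separation(sigma, C, n):
    """Smallest value of (1 - k(x_i, x_j)) * C / n over distinct database pairs."""
    N = num_databases(sigma)
    pts = circle_points(N)
    k_max = max(rbf(pts[i], pts[j], sigma)
                for i in range(N) for j in range(N) if i != j)
    return (1.0 - k_max) * C / n


sigma, C, n = 0.5, 1.0, 10  # sigma < sqrt(1 / (2 log 2)), n > C
N = num_databases(sigma)
separation = min_separation(sigma, C, n)
```

Shrinking `sigma` increases `N` while keeping every pairwise kernel value at most $1/2$, which is what lets the packing grow without losing separation.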


We  again  use  a  similar  argument  as  in  the  linear  kernel  section  above,  to  derive  the  lower 
bound  on  differential  privacy. 

Theorem 44 (Lower bound on optimal differential privacy for hinge-loss). For $C > 0$, $n > C$, $\delta \in (0, 1)$, $\epsilon \in \left(0, \frac{C}{4n}\right)$, and $0 < \sigma < \sqrt{\frac{1}{2 \log_e 2}}$, the optimal differential privacy for the hinge SVM with RBF kernel having parameter $\sigma$ is lower-bounded by $\log_e \frac{(1 - \delta)(N - 1)}{\delta}$, where $N = \left\lfloor \frac{2}{\sigma} \sqrt{\frac{2}{\log_e 2}} \right\rfloor$. That is, under these conditions, all mechanisms that are $(\epsilon, \delta)$-useful wrt hinge SVM with RBF kernel for any $\sigma$ do not achieve differential privacy at any level.

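Proof. Consider $(\epsilon, \delta)$-useful mechanism $\hat{M}$ with respect to hinge SVM learning mechanism $M$ with parameter $C > 0$ and RBF kernel with parameter $0 < \sigma < \sqrt{\frac{1}{2 \log_e 2}}$ on $n > C$ training examples, where $\delta > 0$ and $\frac{C}{4n} > \epsilon > 0$. Let $N = \left\lfloor \frac{2}{\sigma} \sqrt{\frac{2}{\log_e 2}} \right\rfloor \geq 4$. By Lemma 43 there exist pairwise neighboring databases $D_1, \ldots, D_N$ of $n$ entries, such that $\{f_i^*\}_{i=1}^N$ is an $\epsilon$-packing wrt the $L_\infty$-norm, where $f_i^* = f_{M(D_i)}$. So by the utility of $\hat{M}$, for each $i \in [N]$

\[
\Pr\left( \hat{f}_i \in B_\epsilon^\infty(f_i^*) \right) \geq 1 - \delta , \tag{4.12}
\]
\[
\sum_{j \neq 1} \Pr\left( \hat{f}_1 \in B_\epsilon^\infty(f_j^*) \right) \leq \Pr\left( \hat{f}_1 \notin B_\epsilon^\infty(f_1^*) \right) < \delta ,
\]
\[
\Rightarrow \; \exists j \neq 1, \quad \Pr\left( \hat{f}_1 \in B_\epsilon^\infty(f_j^*) \right) < \frac{\delta}{N - 1} . \tag{4.13}
\]

Let $\mathcal{P}_1$ and $\mathcal{P}_j$ be the distributions of $\hat{M}(D_1)$ and $\hat{M}(D_j)$ respectively so that for each, $\mathcal{P}_i(t) = \Pr\left( \hat{M}(D_i) = t \right)$. Then by Inequalities (4.12) and (4.13)

\[
\mathbb{E}_{T \sim \mathcal{P}_j}\left[ \frac{d\mathcal{P}_1(T)}{d\mathcal{P}_j(T)} \;\middle|\; T \in B_\epsilon^\infty(f_j^*) \right]
= \frac{\int_{B_\epsilon^\infty(f_j^*)} \frac{d\mathcal{P}_1(t)}{d\mathcal{P}_j(t)} \, d\mathcal{P}_j(t)}{\int_{B_\epsilon^\infty(f_j^*)} d\mathcal{P}_j(t)}
\leq \frac{\delta}{(1 - \delta)(N - 1)} .
\]

Thus there exists a $t$ such that $\log \frac{\Pr(\hat{M}(D_j) = t)}{\Pr(\hat{M}(D_1) = t)} \geq \log \frac{(1 - \delta)(N - 1)}{\delta}$. □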

Note that $n > C$ is a weak condition, since $C$ should grow like $\sqrt{n}$ for universal consistency. Also note that this negative result is consistent with our upper bound on optimal differential privacy: $\sigma$ affects $\sigma_p$, increasing the upper bounds as $\sigma \to 0$.
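For a feel of the magnitudes involved, the lower bounds of this section can be evaluated directly. The sketch below assumes the bounds take the forms $\log_e \frac{1-\delta}{\delta}$ for the linear kernel and $\log_e \frac{(1-\delta)(N-1)}{\delta}$ for an RBF packing of size $N$; the function names are illustrative, and $N$ is passed in rather than derived from the kernel parameter.

```python
import math


def linear_kernel_lower_bound(delta):
    """Assumed lower bound log((1 - delta) / delta) on beta for the linear kernel."""
    return math.log((1.0 - delta) / delta)


def rbf_kernel_lower_bound(delta, N):
    """Assumed lower bound log((1 - delta) (N - 1) / delta) on beta,
    for a packing of N pairwise-neighboring databases."""
    if N < 2:
        raise ValueError("a packing needs at least two databases")
    return math.log((1.0 - delta) * (N - 1) / delta)


# A mechanism that is accurate with probability 0.95 (delta = 0.05):
bound_linear = linear_kernel_lower_bound(0.05)
bound_rbf_small_packing = rbf_kernel_lower_bound(0.05, 4)
bound_rbf_large_packing = rbf_kernel_lower_bound(0.05, 1000)
```

With $N = 2$ the RBF expression reduces to the linear-kernel bound, and the bound grows logarithmically in the packing size.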


4.7  Summary 

In this chapter we present a pair of new mechanisms for private SVM learning. In each case we establish differential privacy via the algorithmic stability of regularized empirical risk minimization. To achieve utility under infinite-dimensional feature mappings, we perform regularized ERM in a random Reproducing Kernel Hilbert Space whose kernel approximates the target RKHS kernel. This trick, borrowed from large-scale learning, permits the mechanism to privately respond with a finite representation of a maximum-margin hyperplane classifier. We then establish the high-probability, pointwise similarity between the resulting function and the SVM classifier through a new smoothness result of regularized ERM with respect to perturbations of the RKHS. The bounds on differential privacy and utility combine to upper bound the optimal differential privacy of SVM learning for hinge-loss. This quantity is the optimal level of privacy among all mechanisms that are $(\epsilon, \delta)$-useful with respect to the hinge-loss SVM. Finally, we derive a lower bound on this quantity which establishes that any mechanism that is too accurate with respect to the hinge SVM with RBF kernel, with any non-trivial probability, cannot be $\beta$-differentially private for small $\beta$. The lower bounds explicitly depend on the variance of the RBF kernel.
Part II

Applications of Machine Learning in
Computer Security

Chapter 5

Learning-Based  Reactive  Security 


What’s  important  is  to  understand  the  delineation  between 
what’s  considered  “acceptable”  and  “unacceptable”  spending. 
The goal is to prevent spending on reactive security “firefighting”.
~  John  N.  Stewart,  VP  (Chief  Security  Officer),  Cisco  Systems 


Despite  the  conventional  wisdom  that  proactive  security  is  superior  to  reactive  security, 
this  chapter  aims  to  show  that  reactive  security  can  be  competitive  with  proactive  security 
as  long  as  the  reactive  defender  learns  from  past  attacks  instead  of  myopically  overreacting 
to  the  last  attack.  A  proposed  game-theoretic  model  follows  common  practice  in  the  security 
literature by making worst-case assumptions about the attacker: we grant the attacker complete knowledge of the defender’s strategy and do not require the attacker to act rationally.
In  this  model,  we  bound  the  competitive  ratio  between  a  reactive  defense  algorithm  (which 
is  inspired  by  online  learning  theory)  and  the  best  fixed  proactive  defense.  Additionally,  we 
show  that,  unlike  proactive  defenses,  this  reactive  strategy  is  robust  to  a  lack  of  information 
about  the  attacker’s  incentives  and  knowledge. 

The learning-based risk management strategy developed in this chapter faces an attacker that can manipulate both the training and test data in an attempt to maximize her profit or multiplicative return on investment. In the language of the taxonomy of attacks overviewed earlier in this dissertation, these are Targeted Causative and Exploratory attacks. Our worst-case analysis, which grants the attacker complete control over the data and complete knowledge of the learner, shows that relative to all fixed proactive defenders, the reactive strategy asymptotically performs well.


5.1  Introduction 

Many enterprises employ a Chief Information Security Officer (CISO) to manage the enterprise’s information security risks. Typically, an enterprise has many more security vulnerabilities than it can realistically repair. Instead of declaring the enterprise “insecure” until every last vulnerability is plugged, CISOs typically perform a cost-benefit analysis to identify which risks to address, but what constitutes an effective CISO strategy? The conventional wisdom (Kark et al., 2009; Pironti, 2005) is that CISOs ought to adopt a “forward-looking” proactive approach to mitigating security risk by examining the enterprise for vulnerabilities that might be exploited in the future. Advocates of proactive risk management often equate reactive security with myopic bug-chasing and consider it ineffective. We establish sufficient conditions for when reacting strategically to attacks is as effective in discouraging attackers.

We study the efficacy of reactive strategies in an economic model of the CISO’s security cost-benefit trade-offs. Unlike previously proposed economic models of security (see Section 5.1.1), we do not assume the attacker acts according to a fixed probability distribution. Instead, we consider a game-theoretic model with a strategic attacker who responds
to  the  defender’s  strategy.  As  is  standard  in  the  security  literature,  we  make  worst-case 
assumptions  about  the  attacker.  For  example,  we  grant  the  attacker  complete  knowledge  of 
the  defender’s  strategy  and  do  not  require  the  attacker  to  act  rationally.  Further,  we  make 
conservative  assumptions  about  the  reactive  defender’s  knowledge  and  do  not  assume  the 
defender  knows  all  the  vulnerabilities  in  the  system  or  the  attacker’s  incentives.  However, 
we  do  assume  that  the  defender  can  observe  the  attacker’s  past  actions,  for  example  via  an 
intrusion  detection  system  or  user  metrics  (Beard,  2008). 

In  our  model,  we  find  that  two  properties  are  sufficient  for  a  reactive  strategy  to  perform 
as  well  as  the  best  proactive  strategies.  First,  no  single  attack  is  catastrophic,  meaning  the 
defender  can  survive  a  number  of  attacks.  This  is  consistent  with  situations  where  intrusions 
(that,  say,  steal  credit  card  numbers)  are  regrettable  but  not  business-ending.  Second,  the 
defender’s  budget  is  liquid,  meaning  the  defender  can  re-allocate  resources  without  penalty. 
For  example,  a  CISO  can  reassign  members  of  the  security  team  from  managing  firewall 
rules  to  improving  database  access  controls  at  relatively  low  switching  costs. 

Because  our  model  abstracts  many  vulnerabilities  into  a  single  graph  edge,  we  view  the 
act  of  defense  as  increasing  the  attacker’s  cost  for  mounting  an  attack  instead  of  preventing 
the attack (e.g., by patching a single bug). By making this assumption, we choose not to
study  the  tactical  patch-by-patch  interaction  of  the  attacker  and  defender.  Instead,  we  model 
enterprise  security  at  a  more  abstract  level  appropriate  for  the  CISO.  For  example,  the  CISO 
might  allocate  a  portion  of  his  or  her  budget  to  engage  a  consultancy,  such  as  WhiteHat 
or  iSEC  Partners,  to  find  and  fix  cross-site  scripting  in  a  particular  web  application  or  to 
require that employees use SecurID tokens during authentication. We make the technical
assumption  that  attacker  costs  are  linearly  dependent  on  defense  investments  locally.  This 
assumption  does  not  reflect  patch-by-patch  interaction,  which  would  be  better  represented 
by  a  step  function  (with  the  step  placed  at  the  cost  to  deploy  the  patch).  Instead,  this 
assumption  reflects  the  CISO’s  higher-level  viewpoint  where  the  staircase  of  summed  step 
functions  fades  into  a  slope. 

We evaluate the defender’s strategy by measuring the attacker’s cumulative return-on-investment, the return-on-attack (ROA), which has been proposed previously (Cremonini 2005). By studying this metric, we focus on defenders who seek to “cut off the attacker’s oxygen,” that is to reduce the attacker’s incentives for attacking the enterprise. We do not distinguish between “successful” and “unsuccessful” attacks. Instead, we compare the payoff the attacker receives from his or her nefarious deeds with the cost of performing said deeds. We imagine that sufficiently disincentivized attackers will seek alternatives, such as attacking a different organization or starting a legitimate business.

In  our  main  result,  we  show  sufficient  conditions  for  a  learning-based  reactive  strategy  to 
be  competitive  with  the  best  fixed  proactive  defense  in  the  sense  that  the  competitive  ratio 
between the reactive ROA and the proactive ROA is at most $1 + \epsilon$, for all $\epsilon > 0$, provided the game lasts sufficiently many rounds (at least $\Omega(1/\epsilon)$). To prove our theorems, we draw on
techniques  from  the  online  learning  literature.  We  extend  these  techniques  to  the  case  where 
the  learner  does  not  know  all  the  game  matrix  rows  a  priori,  letting  us  analyze  situations 
where  the  defender  does  not  know  all  the  vulnerabilities  in  advance.  Although  our  main 
results  are  in  a  graph-based  model  with  a  single  attacker,  our  results  generalize  to  a  model 
based  on  Horn  clauses  with  multiple  attackers,  corresponding  to  hypergraph-based  models. 
Our  results  are  also  robust  to  switching  from  ROA  to  attacker  profit  and  to  allowing  the 
proactive  defender  to  revise  the  defense  allocation  a  fixed  number  of  times. 

Although  myopic  bug  chasing  is  most  likely  an  ineffective  reactive  strategy,  we  find  that 
in  some  situations  a  strategic  reactive  strategy  is  as  effective  as  the  optimal  fixed  proactive 
defense.  In  fact,  we  find  that  the  natural  strategy  of  gradually  reinforcing  attacked  edges 
by  shifting  budget  from  unattacked  edges  “learns”  the  attacker’s  incentives  and  constructs 
an  effective  defense.  Such  a  strategic  reactive  strategy  is  both  easier  to  implement  than 
a  proactive  strategy — because  it  does  not  presume  that  the  defender  knows  the  attacker’s 
intent  and  capabilities — and  is  less  wasteful  than  a  proactive  strategy  because  the  defender 
does  not  expend  budget  on  attacks  that  do  not  actually  occur.  Based  on  our  results,  we 
encourage  CISOs  to  question  the  assumption  that  proactive  risk  management  is  inherently 
superior  to  reactive  risk  management. 


Chapter Organization. The remainder of this section surveys related work. Section 5.2 formalizes our model. Section 5.3 shows that perimeter defense and defense-in-depth arise naturally in our model. Section 5.4 presents the main results of the chapter, bounding the competitive ratio of reactive versus proactive defense strategies. Section 5.5 outlines scenarios in which reactive security out-performs proactive security. Section 5.6 generalizes our results to Horn clauses and multiple attackers. Section 5.7 concludes the chapter with a short summary of the main contributions.


5.1.1  Related  Work 


Anderson (2001) and Varian (2000) informally discuss (via anecdotes) how the design of information security must take incentives into account. August and Tunca (2006) compare various ways to incentivize users to patch their systems in a setting where the users are more susceptible to attacks if their neighbors do not patch.

Gordon and Loeb (2002) and Hausken (2006) analyze the costs and benefits of security in an economic model (with non-strategic attackers) where the probability of a successful exploit is a function of the defense investment. They use this model to compute the optimal level of investment. Varian (2001) studies various (single-shot) security games and identifies how much agents invest in security at equilibrium. Grossklags et al. (2008) extends this model by letting agents self-insure.

Figure 5.1: An attack graph representing an enterprise data center.


Miura-Ko et al. (2008) study externalities that appear due to users having the same password across various websites and discuss Pareto-improving security investments. Miura-Ko and Bambos (2007) rank vulnerabilities according to a random-attacker model. Skybox and RedSeal offer practical systems that help enterprises prioritize vulnerabilities based on a random-attacker model. Kumar et al. (2008) investigate optimal security architectures for a multi-division enterprise, taking into account losses due to lack of availability and confidentiality. None of the above papers explicitly model a truly adversarial attacker.

Fultz and Grossklags (2009) generalizes (Grossklags et al., 2008) by modeling attackers explicitly. Cavusoglu et al. (2008) highlight the importance of using a game-theoretic model over a decision theoretic model due to the presence of adversarial attackers. However, these models look at idealized settings that are not generically applicable. Lye and Wing (2002) study the Nash equilibrium of a single-shot game between an attacker and a defender that models a particular enterprise security scenario. Arguably this model is most similar to ours in terms of abstraction level. However, calculating the Nash equilibrium requires detailed knowledge of the adversary’s incentives, which as discussed in the introduction, might not be readily available to the defender. Moreover, their game contains multiple equilibria, weakening their prescriptions.


5.2  Formal  Model 


In this section, we present a game-theoretic model of attack and defense. Unlike traditional bug-level attack graphs, our model is meant to capture a managerial perspective on enterprise security. The model is somewhat general in the sense that attack graphs can represent a number of concrete situations, including a network (see Figure 5.1), components in a complex software system (Fisher, 2008), or an Internet Fraud “Battlefield” (Friedberg 2007).




5.2.1  System 


We model a system using a directed graph $(V, E)$, which defines the game between an attacker and a defender. Each vertex $v \in V$ in the graph represents a state of the system. Each edge $e \in E$ represents a state transition the attacker can induce. For example, a vertex
might  represent  whether  a  particular  machine  in  a  network  has  been  compromised  by  an 
attacker.  An  edge  from  one  machine  to  another  might  represent  that  an  attacker  who  has 
compromised  the  first  machine  might  be  able  to  compromise  the  second  machine  because 
the  two  are  connected  by  a  network.  Alternatively,  the  vertices  might  represent  different 
components  in  a  software  system.  An  edge  might  represent  that  an  attacker  sending  input 
to  the  first  component  can  send  input  to  the  second. 

In attacking the system, the attacker selects a path in the graph that begins with a designated start vertex $s$. Our results hold in more general models (e.g., based on Horn clauses), but we defer discussing such generalizations until Section 5.6. We think of the attack as driving the system through the series of state transitions indicated by the edges included in the path. In the networking example in Figure 5.1, an attacker might first compromise a front-end server and then leverage the server’s connectivity to the back-end database server to steal credit card numbers from the database.


Incentives and Rewards. Attackers respond to incentives. For example, attackers compromise machines and form botnets because they make money from spam (Kanich et al., 2008) or rent the botnet to others (Warner, 2004). Other attackers steal credit card numbers because credit card numbers have monetary value (Franklin et al., 2007). We model the attacker’s incentives by attaching a non-negative reward to each vertex. These rewards are the utility the attacker derives from driving the system into the state represented by the vertex. For example, compromising the database server might have a sizable reward because the database server contains easily monetizable credit card numbers. We assume the start vertex has zero reward, forcing the attacker to undertake some action before earning utility. Whenever the attacker mounts an attack, the attacker receives a payoff equal to the sum of the rewards of the vertices visited in the attack path: $\mathrm{payoff}(a) = \sum_{v \in a} \mathrm{reward}(v)$. In the example from Figure 5.1, if an attacker compromises both a front-end server and the database server, the attacker receives both rewards.


Attack Surface and Cost. The defender has a fixed defense budget $B > 0$, which the
defender can divide among the edges in the graph according to a defense allocation $d$: for
all $e \in E$, $d(e) \ge 0$ and $\sum_{e \in E} d(e) \le B$.

The defender’s allocation of budget to various edges corresponds to the decisions made
by the Chief Information Security Officer (CISO) about where to allocate the enterprise’s
security resources. For example, the CISO might allocate organizational headcount to fuzzing
enterprise web applications for XSS vulnerabilities. These kinds of investments are
continuous in the sense that the CISO can allocate 1/4 of a full-time employee to worrying about
XSS. We denote the set of feasible allocations of budget $B$ on edge set $E$ by $\mathcal{D}_{B,E}$.

By defending an edge, the defender makes it more difficult for the attacker to use that
edge in an attack. Each unit of budget the defender allocates to an edge raises the cost
that the attacker must pay to use that edge in an attack. Each edge has an attack
surface (Howard, 2004) $w$ that represents the difficulty in defending against that state
transition. For example, a server that runs both Apache and Sendmail has a larger attack
surface than one that runs only Apache because defending the first server is more difficult
than the second. Formally, the attacker must pay the following cost to traverse the edges of
an attack $a$: $\mathrm{cost}(a, d) = \sum_{e \in a} d(e)/w(e)$. Allocating defense budget to an edge does not
“reduce” an edge’s attack surface. For example, consider defending a hallway with bricks. The wider
the hallway (the larger the attack surface), the more bricks (budget allocation) required to
build a wall of a certain height (the cost to the attacker).

In  this  formulation,  the  function  mapping  the  defender’s  budget  allocation  to  attacker 
cost  is  linear,  preventing  the  defender  from  ever  fully  defending  an  edge.  Our  use  of  a 
linear  function  reflects  a  level  of  abstraction  more  appropriate  to  a  CISO  who  can  never 
fully  defend  assets,  which  we  justify  by  observing  that  the  rate  of  vulnerability  discovery 
in a particular piece of software is roughly constant (Rescorla, 2005). At a lower level of
detail, we might replace this function with a step function, indicating that the defender can
“patch”  a  vulnerability  by  allocating  a  threshold  amount  of  budget. 


5.2.2  Objective 


To evaluate defense strategies, we measure the attacker’s incentive for attacking using
the return-on-attack (ROA) (Cremonini, 2005), which we define as follows:

$$\mathrm{ROA}(a, d) = \frac{\mathrm{payoff}(a)}{\mathrm{cost}(a, d)} .$$

We use this metric for evaluating defense strategy because we believe that if the defender
lowers the ROA sufficiently, the attacker will be discouraged from attacking the system and
will find other uses for his or her capital or industry. For example, the attacker might decide
to attack another system. Analogous results hold if we quantify the attacker’s incentives in
terms of profit (e.g., with $\mathrm{profit}(a, d) = \mathrm{payoff}(a) - \mathrm{cost}(a, d)$), but we focus on ROA for
simplicity.
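The definitions of payoff, cost, and ROA above can be sketched in a few lines of Python. This is a minimal illustration, not the dissertation's code: the function names, the representation of an attack as a list of vertices starting at `s`, and the toy rewards, attack surfaces, and allocation are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def payoff(path: List[str], reward: Dict[str, float]) -> float:
    """Sum of the rewards of the vertices visited by the attack path."""
    return sum(reward[v] for v in path)


def cost(path: List[str], d: Dict[Edge, float], w: Dict[Edge, float]) -> float:
    """Sum over traversed edges of allocated budget d(e) divided by attack surface w(e)."""
    return sum(d.get(e, 0.0) / w[e] for e in zip(path, path[1:]))


def roa(path: List[str], reward: Dict[str, float],
        d: Dict[Edge, float], w: Dict[Edge, float]) -> float:
    """Return-on-attack: payoff divided by cost (unbounded if the attack is free)."""
    p, c = payoff(path, reward), cost(path, d, w)
    if c == 0.0:
        return float("inf") if p > 0 else 0.0
    return p / c


# Hypothetical toy system: start vertex "s" has zero reward, as the model assumes.
reward = {"s": 0.0, "front": 1.0, "db": 9.0}
w = {("s", "front"): 2.0, ("front", "db"): 1.0}
d = {("s", "front"): 1.0, ("front", "db"): 1.0}
attack = ["s", "front", "db"]
print(payoff(attack, reward), cost(attack, d, w), roa(attack, reward, d, w))
```

Treating an undefended attack with positive payoff as having unbounded ROA is a convention chosen here so that the ratio is always defined.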

A  purely  rational  attacker  will  mount  attacks  that  maximize  ROA.  However,  a  real 
attacker  might  not  maximize  ROA.  For  example,  the  attacker  might  not  have  complete 
knowledge  of  the  system  or  its  defense.  We  strengthen  our  results  by  considering  all  attacks, 
not  just  those  that  maximize  ROA. 


5.2.3  Proactive  Security 

We  evaluate  our  learning-based  reactive  approach  by  comparing  it  against  a  proactive 
approach  to  risk  management  in  which  the  defender  carefully  examines  the  system  and 
constructs  a  defense  in  order  to  fend  off  future  attacks.  We  strengthen  this  benchmark  by 
providing  the  proactive  defender  complete  knowledge  about  the  system,  but  we  require  that 
the defender commit to a fixed strategy. To strengthen our results, we state our main result
in terms of all such proactive defenders. In particular, this class of defenders includes the
rational proactive defender who employs a defense allocation that minimizes the maximum
ROA the attacker can extract from the system: $\arg\min_d \max_a \mathrm{ROA}(a, d)$.


5.3  Case  Studies 

In  this  section,  we  describe  instances  of  our  model  to  build  the  reader’s  intuition.  These 
examples  illustrate  that  some  familiar  security  concepts,  including  perimeter  defense  and 
defense  in  depth,  arise  naturally  as  optimal  defenses  in  our  model.  These  defenses  can 
be constructed either by rational proactive defenders or converged to by a learning-based
reactive  defense. 


5.3.1  Perimeter  Defense 


Consider  a  system  in  which  the  attacker’s  reward  is  non-zero  at  exactly  one  vertex,  t. 
For  example,  in  a  medical  system,  the  attacker’s  reward  for  obtaining  electronic  medical 
records  might  well  dominate  the  value  of  other  attack  targets  such  as  employees’  vacation 
calendars.  In  such  a  system,  a  rational  attacker  will  select  the  minimum-cost  path  from  the 
start  vertex  s  to  the  valuable  vertex  t.  The  optimal  defense  limits  the  attacker’s  ROA  by 
maximizing the cost of the minimum $s$-$t$ path. The algorithm for constructing this defense
is straightforward (Chakrabarty et al., 2006):

1. Let $C$ be the minimum weight $s$-$t$ cut in $(V, E, w)$.

2. Select the following defense:

$$d(e) = \begin{cases} B\,w(e)/Z & \text{if } e \in C \\ 0 & \text{otherwise} \end{cases} , \quad \text{where } Z = \sum_{e \in C} w(e) .$$


Notice  that  this  algorithm  constructs  a  perimeter  defense:  the  defender  allocates  the  entire 
defense  budget  to  a  single  cut  in  the  graph.  Essentially,  the  defender  spreads  the  defense 
budget over the attack surface of the cut. By choosing the minimum-weight cut, the defender
is  choosing  to  defend  the  smallest  attack  surface  that  separates  the  start  vertex  from  the 
target  vertex.  Real  defenders  use  similar  perimeter  defenses,  for  example,  when  they  install 
a  firewall  at  the  boundary  between  their  organization  and  the  Internet  because  the  network’s 
perimeter  is  much  smaller  than  its  interior. 
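The two-step construction above can be sketched as follows. This is an illustrative sketch under stated assumptions: the minimum cut is found by brute-force enumeration of vertex sets (exponential, suitable only for toy graphs; a max-flow routine would be used in practice), and the graph, surfaces, and budget are hypothetical.

```python
from itertools import combinations


def min_cut_bruteforce(vertices, w, s, t):
    """Return (weight, cut_edges) of a minimum-weight s-t cut.

    Enumerates every vertex set S with s in S and t not in S; the cut consists
    of the directed edges leaving S. Exponential time: toy graphs only.
    """
    others = [v for v in vertices if v not in (s, t)]
    best = None
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            side = {s, *extra}
            cut = [e for e in w if e[0] in side and e[1] not in side]
            weight = sum(w[e] for e in cut)
            if best is None or weight < best[0]:
                best = (weight, cut)
    return best


def perimeter_defense(vertices, w, s, t, budget):
    """d(e) = B * w(e) / Z on the minimum cut, 0 elsewhere (assumes t is reachable from s)."""
    z, cut = min_cut_bruteforce(vertices, w, s, t)
    return {e: (budget * w[e] / z if e in cut else 0.0) for e in w}


# Hypothetical four-vertex system with attack surfaces as edge weights.
vertices = ["s", "a", "b", "t"]
w = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 1, ("b", "t"): 4}
d = perimeter_defense(vertices, w, "s", "t", budget=6)
print(d)
```

In this toy graph the minimum cut is {(s, b), (a, t)} with weight 3, so every s-t path crosses exactly one defended edge and pays the same cost B/Z = 2.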


5.3.2  Defense  in  Depth 


Many experts in security practice recommend that defenders employ defense in depth.
Defense in depth arises naturally in our model as an optimal defense for some systems.
Consider, for example, the system depicted in Figure 5.2. This attack graph is a simplified
version of the data center network depicted in Figure 5.1. Although the attacker receives
the largest reward for compromising the back-end database server, the attacker also receives
some reward for compromising the front-end web server. Moreover, the front-end web server
has a larger attack surface than the back-end database server because the front-end server
exposes a more complex interface (an entire enterprise web application), whereas the database
server exposes only a simple SQL interface. Allocating defense budget to the left-most edge
represents trying to protect sensitive database information with a complex web application
firewall instead of database access control lists (i.e., possible, but economically inefficient).

Figure 5.2: Attack graph representing a simplified data center network. (Vertices, left to right: Internet, Front End, Database.)

The optimal defense against a rational attacker is to allocate half of the defense budget
to the left-most edge and half of the budget to the right-most edge, limiting the attacker
to a ROA of unity. Shifting the entire budget to the right-most edge (i.e., defending only
the database) is disastrous because the attacker will simply attack the front-end at zero
cost, achieving an unbounded ROA. Shifting the entire budget to the left-most edge is also
problematic because the attacker will attack the database (achieving an ROA of 5).
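The three allocations discussed above can be checked numerically. The labels of Figure 5.2 are not reproduced in this text, so the parameters below are hypothetical: they were chosen only because they reproduce the three ROA values quoted in the paragraph (unity, unbounded, and 5) and should not be read as the figure's actual rewards, attack surfaces, or budget.

```python
def roa(payoff, cost):
    """Return-on-attack, treating a free attack as unbounded."""
    return float("inf") if cost == 0 else payoff / cost


# Hypothetical parameters (not taken from Figure 5.2).
W_FRONT, W_DB = 9.0, 1.0   # attack surfaces: front end is harder to defend
R_FRONT, R_DB = 1.0, 9.0   # rewards: database is the more valuable target
BUDGET = 18.0


def best_attacker_roa(d_front, d_db):
    """Maximum ROA over the two non-trivial attacks: front end only, or front end then database."""
    cost_front = d_front / W_FRONT
    cost_both = cost_front + d_db / W_DB
    return max(roa(R_FRONT, cost_front), roa(R_FRONT + R_DB, cost_both))


split = best_attacker_roa(BUDGET / 2, BUDGET / 2)   # half on each edge
only_db = best_attacker_roa(0.0, BUDGET)            # everything on the right-most edge
only_front = best_attacker_roa(BUDGET, 0.0)         # everything on the left-most edge
print(split, only_db, only_front)
```

Under these assumed values the even split caps the attacker at an ROA of 1, while the two single-edge allocations give an unbounded ROA and an ROA of 5 respectively, matching the qualitative argument for defense in depth.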


5.4  Reactive  Security 


To analyze reactive security, we model the attacker and defender as playing an iterative
game, alternating moves. First, the defender selects a defense, and then the attacker selects
an attack. We present a learning-based reactive defense strategy that is oblivious to vertex
rewards and to edges that have not yet been used in attacks. We prove a theorem bounding
the competitive ratio between this reactive strategy and the best proactive defense via a
series of reductions to results from the online learning theory literature. Other applications
of this literature include managing stock portfolios (Ordentlich and Cover, 1998), playing
zero-sum games (Freund and Schapire, 1999b), and boosting other machine learning
heuristics (Freund and Schapire, 1999a). Although we provide a few technical extensions, our
main contribution comes from applying results from online learning to risk management.


Repeated Game. We formalize the repeated game between the defender and the attacker
as follows. In each round $t$ from 1 to $T$:

1. The defender chooses defense allocation $d_t(e)$ over the edges $e \in E$.

2. The attacker chooses an attack path $a_t$ in $G$.

3. The path $a_t$ and attack surfaces $\{w(e) : e \in a_t\}$ are revealed to the defender.

4. The attacker pays $\mathrm{cost}(a_t, d_t)$ and gains $\mathrm{payoff}(a_t)$.

Algorithm 13 A reactive defense strategy for hidden edges.

• Initialize $E_0 = \emptyset$

• For each round $t \in \{2, \ldots, T\}$

  - Let $E_{t-1} = E_{t-2} \cup E(a_{t-1})$

  - For each $e \in E_{t-1}$, let

    $$S_{t-1}(e) = \begin{cases} S_{t-2}(e) + M(e, a_{t-1}) & \text{if } e \in E_{t-2} \\ M(e, a_{t-1}) & \text{otherwise,} \end{cases}$$

    $$\tilde{P}_t(e) = \beta_{t-1}^{S_{t-1}(e)} ,$$

    $$P_t(e) = \frac{\tilde{P}_t(e)}{\sum_{e' \in E_{t-1}} \tilde{P}_t(e')} ,$$

    where $M(e, a) = -\mathbf{1}[e \in a] / w(e)$ is a matrix with $|E|$ rows and a column for
    each attack.


In each round, we let the attacker choose the attack path after the defender commits to the
defense allocation because the defender’s budget allocation is not a secret (in the sense of a
cryptographic key). Following the “no security through obscurity” principle, we make the
conservative assumption that the attacker can accurately determine the defender’s budget
allocation.


Defender Knowledge. Unlike proactive defenders, reactive defenders do not know all of
the vulnerabilities that exist in the system in advance. (If defenders had complete
knowledge of vulnerabilities, conferences such as Black Hat Briefings would serve little purpose.)
Instead, we reveal an edge (and its attack surface) to the defender after the attacker uses the
edge in an attack. For example, the defender might monitor the system and learn how the
attacker attacked the system by doing a post-mortem analysis of intrusion logs. Formally,
we define a reactive defense strategy to be a function from attack sequences $\{a_t\}$ and the
subsystem induced by the edges contained in $\bigcup_t a_t$ to defense allocations such that $d(e) = 0$
if edge $e \notin \bigcup_t a_t$. Notice that this requires the defender’s strategy to be oblivious to the
system beyond the edges used by the attacker.


5.4.1  Algorithm 

Algorithm 13 is a reactive defense strategy based on the multiplicative update learning
algorithm (Cesa-Bianchi et al., 1997; Freund and Schapire, 1999b). The algorithm reinforces
edges on the attack path multiplicatively, taking the attack surface into account by allocating
more budget to easier-to-defend edges. When new edges are revealed, the algorithm
re-allocates budget uniformly from the already-revealed edges to the newly revealed edges. We
state the algorithm in terms of a normalized defense allocation $P_t(e) = d_t(e)/B$. Notice that
this algorithm is oblivious to unattacked edges and the attacker’s reward for visiting each
vertex. An appropriate setting for the algorithm parameters $\beta_t \in [0, 1)$ will be described
below.

The algorithm begins without any knowledge of the graph whatsoever, and so allocates
no defense budget to the system. Upon the $t$-th attack on the system, the algorithm updates
$E_t$ to be the set of edges revealed up to this point, and updates $S_t(e)$ to be a weighted count
of the number of times $e$ has been used in an attack thus far. For each edge that has ever
been revealed, the defense allocation $P_{t+1}(e)$ is chosen to be $\beta_t^{S_t(e)}$ normalized to sum to
unity over all edges $e \in E_t$. In this way, any edge attacked in round $t$ will have its defense
allocation reinforced.

The parameter $\beta$ controls how aggressively the defender reallocates defense budget to
recently attacked edges. If $\beta$ is infinitesimal, the defender will move the entire defense
budget to the edge on the most recent attack path with the smallest attack surface. If $\beta$ is
enormous (that is, close to one), the defender will not be very agile and, instead, leave the
defense budget in the initial allocation. For an appropriate value of $\beta$, the algorithm will
converge to the optimal defense strategy, for instance the min cut in the example from
Section 5.3.1.
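The reactive strategy described in this section can be sketched as a small class. This is an illustration rather than the author's implementation: the class and method names are hypothetical, edges are arbitrary hashable labels, the logarithm is assumed to be natural, and the parameter schedule $\beta_t = (1 + \sqrt{2 \log |E_t| / (t+1)})^{-1}$ is the one stated in the main theorems.

```python
import math


class ReactiveDefender:
    """Sketch of the hidden-edge multiplicative-update defense (names are hypothetical)."""

    def __init__(self, budget):
        self.budget = budget
        self.score = {}   # S_t(e) for every edge revealed so far
        self.round = 0    # number of attacks observed so far (t)

    def allocation(self):
        """Allocation for the next round: B * beta_t^{S_t(e)}, normalized over revealed edges."""
        if not self.score:
            return {}  # nothing revealed yet, so no budget is allocated
        n = len(self.score)
        beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / (self.round + 1)))
        weights = {e: beta ** s for e, s in self.score.items()}
        z = sum(weights.values())
        return {e: self.budget * p / z for e, p in weights.items()}

    def observe(self, attack_edges, surface):
        """Reveal the attacked edges and add M(e, a) = -1/w(e) to each attacked edge's score."""
        self.round += 1
        for e in attack_edges:
            self.score[e] = self.score.get(e, 0.0) - 1.0 / surface[e]


# Hypothetical two-edge system; only attacked edges ever receive budget.
surface = {"e1": 1.0, "e2": 1.0}
defender = ReactiveDefender(budget=1.0)
for attack in (["e1"], ["e2"], ["e2"]):
    defender.observe(attack, surface)
print(defender.allocation())
```

Because scores are negative and $\beta_t < 1$ whenever more than one edge is visible, repeatedly attacked edges receive a growing share of the budget, and edges with a smaller attack surface are reinforced faster.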


5.4.2  Main  Theorems 

To  compare  this  reactive  defense  strategy  to  all  proactive  defense  strategies,  we  use  the 
notion  of  regret  from  online  learning  theory.  The  following  is  an  additive  regret  bound 
relating  the  attacker’s  profit  under  reactive  and  proactive  defense  strategies. 

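Theorem 45. The average attacker profit against Algorithm 13 converges to the average
attacker profit against the best proactive defense. Formally, if defense allocations
$\{d_t\}_{t=1}^T$ are output by Algorithm 13 with parameter sequence
$\beta_s = \left(1 + \sqrt{2 \log |E_s| / (s+1)}\right)^{-1}$ on any system $(V, E, w, \mathrm{reward}, s)$
revealed online and any attack sequence $\{a_t\}_{t=1}^T$, then

$$\frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d_t) - \frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d^\star) \le B \sqrt{\frac{\log |E|}{2T}} + \frac{B \left(\log |E| + \overline{w^{-1}}\right)}{T}$$

for all proactive defense strategies $d^\star \in \mathcal{D}_{B,E}$, where
$\overline{w^{-1}} = |E|^{-1} \sum_{e \in E} w(e)^{-1}$ is the mean of the surface reciprocals.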

Remark 46. We can interpret Theorem 45 as establishing sufficient conditions under which
a reactive defense strategy is within an additive constant of the best proactive defense strategy.
Instead of carefully analyzing the system to construct the best proactive defense, the defender
need only react to attacks in a principled manner to achieve almost the same quality of
defense in terms of attacker profit.
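To get a feel for how quickly the additive gap in Theorem 45 shrinks, the right-hand side of the bound can be evaluated directly. The helper below is a hypothetical illustration; it assumes the logarithm is natural, and the sample values of the budget, edge count, and mean surface reciprocal are made up.

```python
import math


def profit_regret_bound(budget, num_edges, rounds, mean_inv_surface):
    """Right-hand side of Theorem 45: B*sqrt(log|E|/(2T)) + B*(log|E| + mean(1/w))/T."""
    log_e = math.log(num_edges)  # natural logarithm assumed
    return (budget * math.sqrt(log_e / (2 * rounds))
            + budget * (log_e + mean_inv_surface) / rounds)


# Hypothetical system: B = 1, |E| = 100 edges, all attack surfaces equal to 1.
for rounds in (10, 1_000, 100_000):
    print(rounds, profit_regret_bound(1.0, 100, rounds, 1.0))
```

The first term dominates for large $T$, so the average gap decays on the order of $B\sqrt{\log|E|/T}$.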

Reactive defense strategies can also be competitive with proactive defense strategies
when we consider an attacker motivated by return on attack (ROA). The ROA formulation
is appealing because (unlike with profit) the objective function does not require measuring
attacker cost and defender budget in the same units. The next result considers the
competitive ratio between the ROA for a reactive defense strategy and the ROA for the best
proactive defense strategy.


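Theorem 47. The ROA against Algorithm 13 converges to the ROA against best proactive
defense. Formally, consider the cumulative ROA:

$$\mathrm{ROA}\left(\{a_t\}_{t=1}^T, \{d_t\}_{t=1}^T\right) = \frac{\sum_{t=1}^T \mathrm{payoff}(a_t)}{\sum_{t=1}^T \mathrm{cost}(a_t, d_t)} .$$

(We abuse notation slightly and use singleton arguments to represent the corresponding
constant sequence.) If defense allocations $\{d_t\}_{t=1}^T$ are output by Algorithm 13 with parameters
$\beta_s = \left(1 + \sqrt{2 \log |E_s| / (s+1)}\right)^{-1}$ on any system $(V, E, w, \mathrm{reward}, s)$ revealed online, such
that $|E| > 1$, and any attack sequence $\{a_t\}_{t=1}^T$, then for all $\alpha > 0$ and proactive defense
strategies $d^\star \in \mathcal{D}_{B,E}$

$$\frac{\mathrm{ROA}\left(\{a_t\}_{t=1}^T, \{d_t\}_{t=1}^T\right)}{\mathrm{ROA}\left(\{a_t\}_{t=1}^T, d^\star\right)} \le 1 + \alpha ,$$

provided $T$ is sufficiently large.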

Remark 48. Notice that the reactive defender can use the same algorithm regardless of
whether the attacker is motivated by profit or by ROA. As discussed in Section 5.5, the
optimal proactive defense is not similarly robust.


5.4.3  Proofs  of  the  Main  Theorems 


We now describe a series of reductions that establish the main results. First, we prove
Theorem 45 in the simpler setting where the defender knows the entire graph. Second, we
remove the hypothesis that the defender knows the edges in advance. Finally, we extend our
results to ROA.


5.4.3.1 Bound on Profit: Known Edges Case

Suppose that the reactive defender is granted full knowledge of the system
$(V, E, w, \mathrm{reward}, s)$ from the outset. Specifically, the graph, attack surfaces, and rewards
are all revealed to the defender prior to the first round. Algorithm 14 is a reactive defense
strategy that makes use of this additional knowledge.

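Lemma 49. If defense allocations $\{d_t\}_{t=1}^T$ are output by Algorithm 14 with parameter
$\beta = \left(1 + \sqrt{2 \log |E| / T}\right)^{-1}$ on any system $(V, E, w, \mathrm{reward}, s)$ and attack sequence $\{a_t\}_{t=1}^T$, then

$$\frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d_t) - \frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d^\star) \le B \sqrt{\frac{\log |E|}{2T}} + \frac{B \log |E|}{T}$$

for all proactive defense strategies $d^\star \in \mathcal{D}_{B,E}$.

Footnote to Theorem 47 (“sufficiently large”): To wit: $T \ge \left(\frac{13}{\sqrt{2}} \left(1 + \alpha^{-1}\right) \sum_{e \in \mathrm{inc}(s)} w(e)\right)^2 \log |E|$.

Algorithm 14 Reactive defense strategy for known edges using the multiplicative update
algorithm.

• For each $e \in E$, initialize $P_1(e) = 1/|E|$.

• For each round $t \in \{2, \ldots, T\}$ and $e \in E$, let

  $$P_t(e) = \frac{P_{t-1}(e)\, \beta^{M(e, a_{t-1})}}{Z_t} , \quad \text{where } Z_t = \sum_{e' \in E} P_{t-1}(e')\, \beta^{M(e', a_{t-1})} .$$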
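The known-edges multiplicative update can be sketched as a single function. The sketch assumes uniform initialization $P_1(e) = 1/|E|$, the game matrix $M(e, a) = -\mathbf{1}[e \in a]/w(e)$ defined earlier, natural logarithms, at least one round, and attacks given as sets of edge labels; the function name and the toy inputs are hypothetical.

```python
import math


def multiplicative_update(edges, surface, attacks, budget):
    """Return the allocations d_1, ..., d_T as a list of dicts (requires T >= 1)."""
    T = len(attacks)
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(len(edges)) / T))
    p = {e: 1.0 / len(edges) for e in edges}          # P_1 is uniform
    allocations = [{e: budget * p[e] for e in edges}]
    for attack in attacks[:-1]:                        # a_{t-1} determines P_t
        raw = {e: p[e] * beta ** (-1.0 / surface[e] if e in attack else 0.0)
               for e in edges}
        z = sum(raw.values())                          # normalizer Z_t
        p = {e: raw[e] / z for e in edges}
        allocations.append({e: budget * p[e] for e in edges})
    return allocations


# Hypothetical two-edge system attacked three times.
history = multiplicative_update(
    edges=["e1", "e2"],
    surface={"e1": 1.0, "e2": 1.0},
    attacks=[{"e1"}, {"e1"}, {"e2"}],
    budget=1.0,
)
for t, d in enumerate(history, start=1):
    print(t, d)
```

Unlike the hidden-edge variant, this version spends budget on every edge from the first round, which is why it needs the full edge set up front.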


The lemma’s proof is a reduction to the following regret bound from online learning
(Freund and Schapire, 1999b, Corollary 4).


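Theorem 50. If the multiplicative update algorithm (Algorithm 14) is run with any game
matrix $M$ with elements in $[0, 1]$, and parameter $\beta = \left(1 + \sqrt{2 \log |E| / T}\right)^{-1}$, then

$$\frac{1}{T} \sum_{t=1}^T M(P_t, a_t) - \min_{P^\star \ge 0 \,:\, \sum_{e \in E} P^\star(e) = 1} \left\{ \frac{1}{T} \sum_{t=1}^T M(P^\star, a_t) \right\} \le \sqrt{\frac{\log |E|}{2T}} + \frac{\log |E|}{T} .$$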


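Proof of Lemma 49. Due to the normalization by $Z_t$, the sequence of defense allocations
$\{P_t\}_{t=1}^T$ output by Algorithm 14 is invariant to adding a constant to all elements of matrix
$M$. Let $M'$ be the matrix obtained by adding constant $C$ to all entries of arbitrary game
matrix $M$, and let sequences $\{P_t\}_{t=1}^T$ and $\{P'_t\}_{t=1}^T$ be obtained by running multiplicative
update with matrix $M$ and $M'$ respectively. Then, for all $e \in E$ and $t \in [T - 1]$,

$$P'_{t+1}(e) = \frac{P_1(e)\, \beta^{\sum_{i=1}^t M'(e, a_i)}}{\sum_{e' \in E} P_1(e')\, \beta^{\sum_{i=1}^t M'(e', a_i)}} = \frac{P_1(e)\, \beta^{\sum_{i=1}^t M(e, a_i)}}{\sum_{e' \in E} P_1(e')\, \beta^{\sum_{i=1}^t M(e', a_i)}} = P_{t+1}(e) ,$$

where the middle equality holds because the common factor $\beta^{tC}$ cancels between numerator and denominator.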

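In particular Algorithm 14 produces the same defense allocation sequence as if the game
matrix elements are increased by one to

$$M'(e, a) = \begin{cases} 1 - 1/w(e) & \text{if } e \in a \\ 1 & \text{otherwise.} \end{cases}$$

Because this new matrix has entries in $[0, 1]$ we can apply Theorem 50 to prove for the
original matrix $M$ that

$$\frac{1}{T} \sum_{t=1}^T M(P_t, a_t) - \min_{P^\star} \left\{ \frac{1}{T} \sum_{t=1}^T M(P^\star, a_t) \right\} \le \sqrt{\frac{\log |E|}{2T}} + \frac{\log |E|}{T} . \qquad (5.1)$$

Now, by definition of the original game matrix,

$$M(P_t, a_t) = \sum_{e \in E} -\left(P_t(e)/w(e)\right) \cdot \mathbf{1}[e \in a_t] = -\sum_{e \in a_t} P_t(e)/w(e) = -B^{-1} \sum_{e \in a_t} d_t(e)/w(e) = -B^{-1} \mathrm{cost}(a_t, d_t) .$$

Thus Inequality (5.1) is equivalent to

$$\frac{1}{T} \sum_{t=1}^T \left(-B^{-1} \mathrm{cost}(a_t, d_t)\right) - \min_{d^\star \in \mathcal{D}_{B,E}} \left\{ \frac{1}{T} \sum_{t=1}^T \left(-B^{-1} \mathrm{cost}(a_t, d^\star)\right) \right\} \le \sqrt{\frac{\log |E|}{2T}} + \frac{\log |E|}{T} .$$

Simple algebraic manipulation yields

$$\begin{aligned}
&\frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d_t) - \min_{d^\star \in \mathcal{D}_{B,E}} \left\{ \frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d^\star) \right\} \\
&\quad = \frac{1}{T} \sum_{t=1}^T \left(\mathrm{payoff}(a_t) - \mathrm{cost}(a_t, d_t)\right) - \min_{d^\star \in \mathcal{D}_{B,E}} \left\{ \frac{1}{T} \sum_{t=1}^T \left(\mathrm{payoff}(a_t) - \mathrm{cost}(a_t, d^\star)\right) \right\} \\
&\quad = \frac{1}{T} \sum_{t=1}^T \left(-\mathrm{cost}(a_t, d_t)\right) - \min_{d^\star \in \mathcal{D}_{B,E}} \left\{ \frac{1}{T} \sum_{t=1}^T \left(-\mathrm{cost}(a_t, d^\star)\right) \right\} \\
&\quad \le B \sqrt{\frac{\log |E|}{2T}} + B\, \frac{\log |E|}{T} ,
\end{aligned}$$

completing the proof. □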


5.4.3.2 Bound on Profit: Hidden Edges Case

The standard algorithms in online learning assume that the rows of the matrix are known
in advance. Here, the edges are not known in advance and we must relax this assumption
using a simulation argument, which is perhaps the least obvious part of the reduction.
The defense allocation chosen by Algorithm 13 at time $t$ is precisely the same as the
defense allocation that would have been chosen by Algorithm 14 had the defender run
Algorithm 14 on the currently visible subgraph. The following lemma formalizes this equivalence.
Note that Algorithm 13’s parameter is reactive: it corresponds to Algorithm 14’s
parameter, but for the subgraph induced by the edges revealed so far. That is, $\beta_t$ depends only on
edges visible to the defender in round $t$, letting the defender actually run the algorithm in
practice!


Lemma 51. Consider arbitrary round $t \in [T]$. If Algorithms 13 and 14 are run with
parameters $\beta_s = \left(1 + \sqrt{2 \log |E_s| / (s+1)}\right)^{-1}$ for $s \in [t]$ and parameter
$\beta = \left(1 + \sqrt{2 \log |E_t| / (t+1)}\right)^{-1}$ respectively, with the latter run on the subgraph induced by $E_t$,
then the defense allocations $P_{t+1}(e)$ output by the algorithms are identical for all $e \in E_t$.

Proof. If $e \in E_t$ then $\tilde{P}_{t+1}(e) = \beta^{\sum_{i=1}^t M(e, a_i)}$ because $\beta_t = \beta$, and the round $t + 1$ defense
allocation of Algorithm 13 $P_{t+1}$ is simply $\tilde{P}_{t+1}$ normalized to sum to unity over edge set $E_t$,
which is exactly the defense allocation output by Algorithm 14. □


Armed with this correspondence, we show that Algorithm 13 is almost as effective as
Algorithm 14. In other words, hiding unattacked edges from the defender does not cause
much harm to the reactive defender’s ability to disincentivize the attacker.


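Lemma 52. If defense allocations $\{d_{1,t}\}_{t=1}^T$ and $\{d_{2,t}\}_{t=1}^T$ are output by Algorithms 13 and 14
with parameters $\beta_t = \left(1 + \sqrt{2 \log |E_t| / (t+1)}\right)^{-1}$ for $t \in [T - 1]$ and
$\beta = \left(1 + \sqrt{2 \log |E| / T}\right)^{-1}$, respectively, on a system $(V, E, w, \mathrm{reward}, s)$ and attack sequence
$\{a_t\}_{t=1}^T$, then

$$\frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d_{1,t}) - \frac{1}{T} \sum_{t=1}^T \mathrm{profit}(a_t, d_{2,t}) \le \frac{B\, \overline{w^{-1}}}{T} .$$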


Proof. Consider attack $a_t$ from a round $t \in [T]$ and consider an edge $e \in a_t$. If $e \in a_s$ for
some $s < t$, then the defense budget allocated to $e$ at time $t$ by Algorithm 14 cannot be
greater than the budget allocated by Algorithm 13. Thus, the instantaneous cost paid by the
attacker on $e$ when Algorithm 13 defends is at least the cost paid when Algorithm 14 defends:
$d_{1,t}(e)/w(e) \ge d_{2,t}(e)/w(e)$. If $e \notin \bigcup_{s=1}^{t-1} a_s$ then for all $s \in [t]$, $d_{1,s}(e) = 0$, by definition. The
sequence $\{d_{2,s}(e)\}_{s=1}^{t-1}$ is decreasing and positive. Thus $\max_{s \le t} d_{2,s}(e) - d_{1,s}(e)$ is optimized
at $s = 1$ and is equal to $B/|E|$. Finally because each edge $e \in E$ is first revealed exactly
once this leads to

$$\sum_{t=1}^T \mathrm{cost}(a_t, d_{2,t}) - \sum_{t=1}^T \mathrm{cost}(a_t, d_{1,t}) = \sum_{t=1}^T \sum_{e \in a_t} \frac{d_{2,t}(e) - d_{1,t}(e)}{w(e)} \le \sum_{e \in E} \frac{B}{|E|\, w(e)} .$$

Combined with the fact that the attacker receives the same payout whether Algorithm 14
or Algorithm 13 defends completes the result. □


Proof of Theorem 45. The theorem’s result follows immediately from combining Lemma 49
and Lemma 52. □

Finally, notice that Algorithm 13 enjoys the same time and space complexities as
Algorithm 14, up to constants.


5.4.3.3 Bound on ROA: Hidden Edges Case

We now translate our bounds on profit into bounds on ROA by observing that the ratio of
two  quantities  is  small  if  the  quantities  are  large  and  their  difference  is  small.  We  consider 
the  competitive  ratio  between  a  reactive  defense  strategy  and  the  best  proactive  defense 
strategy  after  the  following  technical  lemma,  which  asserts  that  the  quantities  are  large. 

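Lemma 53. For all attack sequences $\{a_t\}_{t=1}^T$, $\max_{d^\star \in \mathcal{D}_{B,E}} \sum_{t=1}^T \mathrm{cost}(a_t, d^\star) \ge VT$ where
game value $V$ is $\max_{d \in \mathcal{D}_{B,E}} \min_a \mathrm{cost}(a, d) = \frac{B}{\sum_{e \in \mathrm{inc}(s)} w(e)} > 0$, where $\mathrm{inc}(v) \subseteq E$ denotes the
edges incident to vertex $v$.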

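Proof. Let $d^\star = \arg\max_{d \in \mathcal{D}_{B,E}} \min_a \mathrm{cost}(a, d)$ witness the game’s value $V$, then
$\max_{d \in \mathcal{D}_{B,E}} \sum_{t=1}^T \mathrm{cost}(a_t, d) \ge \sum_{t=1}^T \mathrm{cost}(a_t, d^\star) \ge TV$. Consider the defensive
allocation $\tilde{d}$ defined for each $e \in E$ as follows. If $e \in \mathrm{inc}(s)$, let $\tilde{d}(e) = B\, w(e) / \sum_{e' \in \mathrm{inc}(s)} w(e') > 0$, and otherwise
$\tilde{d}(e) = 0$. This allocation is feasible because

$$\sum_{e \in E} \tilde{d}(e) = \frac{B \sum_{e \in \mathrm{inc}(s)} w(e)}{\sum_{e \in \mathrm{inc}(s)} w(e)} = B .$$

By definition $\tilde{d}(e)/w(e) = B / \sum_{e' \in \mathrm{inc}(s)} w(e')$ for each edge $e$ incident to $s$. Therefore,
$\mathrm{cost}(a, \tilde{d}) \ge B / \sum_{e \in \mathrm{inc}(s)} w(e)$ for any non-trivial attack $a$, which necessarily includes at
least one $s$-incident edge. Finally, $V \ge \min_a \mathrm{cost}(a, \tilde{d})$ proves

$$V \ge \frac{B}{\sum_{e \in \mathrm{inc}(s)} w(e)} . \qquad (5.2)$$

Now, consider a defense allocation $d$ and fix an attack $a$ that minimizes the total attacker
cost under $d$. At most one edge $e \in a$ can have $d(e) > 0$, for otherwise the cost under $d$ can
be reduced by removing an edge from $a$. Moreover any attack $a \in \arg\min_{e \in \mathrm{inc}(s)} d(e)/w(e)$
minimizes attacker cost under $d$. Thus the maximin $V$ is witnessed by defense allocations
that maximize $\min_{e \in \mathrm{inc}(s)} d(e)/w(e)$. This maximization is achieved by allocation $\tilde{d}$ and so
Inequality (5.2) is an equality. □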
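The closed form for the game value and the witnessing allocation from the proof can be computed directly. In this sketch, edges are directed pairs and the edges "incident" to the start vertex are taken to be those leaving it, which is an assumption about how inc(s) is read; the function names and the toy system are hypothetical.

```python
def game_value(budget, surface, start):
    """V = B / (sum of attack surfaces of the edges leaving the start vertex)."""
    return budget / sum(w for (u, _v), w in surface.items() if u == start)


def witness_allocation(budget, surface, start):
    """Spread the budget over start-incident edges in proportion to attack surface."""
    total = sum(w for (u, _v), w in surface.items() if u == start)
    return {e: (budget * w / total if e[0] == start else 0.0)
            for e, w in surface.items()}


# Hypothetical system: two edges leave "s", one edge lies deeper in the graph.
surface = {("s", "a"): 2.0, ("s", "b"): 1.0, ("a", "t"): 5.0}
d = witness_allocation(6.0, surface, "s")
print(game_value(6.0, surface, "s"), d)
```

Every edge leaving the start vertex ends up with the same ratio $d(e)/w(e) = V$, so any non-trivial attack pays at least $V$, as the proof argues.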


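We are now ready to prove the main ROA theorem:

Proof of Theorem 47. First, observe that for all $B > 0$ and all $A, C \in \mathbb{R}$

$$\frac{A}{B} \le C \iff A - B \le (C - 1) B . \qquad (5.3)$$

We will use this equivalence to convert the regret bound on profit to the desired bound on
ROA. Together Theorem 45 and Lemma 53 imply

$$\begin{aligned}
\alpha \sum_{t=1}^T \mathrm{cost}(a_t, d_t) &\ge \alpha \max_{d^\star \in \mathcal{D}_{B,E}} \sum_{t=1}^T \mathrm{cost}(a_t, d^\star) - \frac{\alpha B}{\sqrt{2}} \sqrt{T \log |E|} - \alpha B \left(\log |E| + \overline{w^{-1}}\right) \\
&\ge \alpha V T - \frac{\alpha B}{\sqrt{2}} \sqrt{T \log |E|} - \alpha B \left(\log |E| + \overline{w^{-1}}\right) , \qquad (5.4)
\end{aligned}$$

where $V = \max_{d \in \mathcal{D}_{B,E}} \min_a \mathrm{cost}(a, d) > 0$. If

$$\sqrt{T} \ge \frac{13}{\sqrt{2}} \left(1 + \alpha^{-1}\right) \sqrt{\log |E|} \sum_{e \in \mathrm{inc}(s)} w(e) ,$$

we can use inequalities $V = B / \sum_{e \in \mathrm{inc}(s)} w(e)$, $\overline{w^{-1}} \le 2 \log |E|$ (since $|E| > 1$), and
$\left(\sum_{e \in \mathrm{inc}(s)} w(e)\right)^{-1} \le 1$ to show

$$\sqrt{T} \ge \left((1 + \alpha) B + \sqrt{\left[(1 + \alpha) B + 24 \alpha V\right] (1 + \alpha) B}\right) \left(2 \sqrt{2}\, \alpha V\right)^{-1} \sqrt{\log |E|} ,$$

which combines with Theorem 45 and Inequality (5.4) to imply

$$\alpha \sum_{t=1}^T \mathrm{cost}(a_t, d_t) \ge \max_{d^\star \in \mathcal{D}_{B,E}} \sum_{t=1}^T \mathrm{cost}(a_t, d^\star) - \sum_{t=1}^T \mathrm{cost}(a_t, d_t) .$$

Finally, combining this equation with Equivalence (5.3) yields the result

$$\begin{aligned}
\frac{\mathrm{ROA}\left(\{a_t\}_{t=1}^T, \{d_t\}_{t=1}^T\right)}{\min_{d^\star \in \mathcal{D}_{B,E}} \mathrm{ROA}\left(\{a_t\}_{t=1}^T, d^\star\right)}
&= \frac{\sum_{t=1}^T \mathrm{payoff}(a_t)}{\sum_{t=1}^T \mathrm{cost}(a_t, d_t)} \cdot \max_{d^\star \in \mathcal{D}_{B,E}} \frac{\sum_{t=1}^T \mathrm{cost}(a_t, d^\star)}{\sum_{t=1}^T \mathrm{payoff}(a_t)} \\
&= \frac{\max_{d^\star \in \mathcal{D}_{B,E}} \sum_{t=1}^T \mathrm{cost}(a_t, d^\star)}{\sum_{t=1}^T \mathrm{cost}(a_t, d_t)} \\
&\le 1 + \alpha .
\end{aligned}$$

□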


5.4.4  Lower  Bounds 


We briefly argue the optimality of Algorithm 13 for a particular graph, i.e., we show that
Algorithm 13 has optimal convergence time for small enough $\alpha$, up to constants. (For very
large $\alpha$, Algorithm 13 converges in constant time, and therefore is optimal up to constants,
vacuously.) This result establishes a lower bound on the competitive ratio of the ROA for all
reactive  strategies.  The  proof  gives  an  example  where  the  best  proactive  defense  (slightly) 
out-performs  every  reactive  strategy,  suggesting  the  benchmark  is  not  unreasonably  weak. 

The  argument  considers  an  attacker  who  randomly  selects  an  attack  path,  rendering 
knowledge  of  past  attacks  useless.  Consider  a  two-vertex  graph  where  the  start  vertex  s  is 
connected to a vertex $r$ (with reward 1) by two parallel edges $e_1$ and $e_2$, each with an attack
surface  of  1.  Further  suppose  that  the  defense  budget  B  =  1.  We  first  show  a  lower  bound 
on  all  reactive  algorithms: 


Lemma 54. For all reactive algorithms $A$, the competitive ratio $C$ is at least $(x + \Omega(\sqrt{T}))/x$,
i.e., at least $(T + \Omega(\sqrt{T}))/T$ because $x \le T$.


Proof. Consider the following random attack sequence: For each round, select an attack
path uniformly i.i.d. from the set $\{e_1, e_2\}$. A reactive strategy must commit to a defense in
every  round  without  knowledge  of  the  attack,  and  therefore  every  strategy  that  expends 
the  entire  budget  of  1  inflicts  an  expected  cost  of  1/2  in  every  round.  Thus,  every  reactive 
strategy  inflicts  a  total  expected  cost  of  (at  most)  T/2,  where  the  expectation  is  over  the 
coin-tosses  of  the  random  attack  process. 

Given  an  attack  sequence,  however,  there  exists  a  proactive  defense  allocation  with  better 
performance.  We  can  think  of  the  proactive  defender  being  prescient  as  to  which  edge  (ci  or 
62)  will  be  attacked  most  frequently  and  allocating  the  entire  defense  budget  to  that  edge. 
It  is  well-known  (for  instance  via  an  analysis  of  a  one-dimensional  random  walk)  that  in 
such  a  random  process,  one  of  the  edges  will  occur  f2(-\/T)  more  often  than  the  other,  in 
expectation. 

By the probabilistic method, a property that is true in expectation must hold existentially,
and, therefore, for every reactive strategy A, there exists an attack sequence such that
A has a cost x, whereas the best proactive strategy (in retrospect) has a cost $x + \Omega(\sqrt{T})$.
Because the payoff of each attack is 1, the total reward in either case is T. The prescient
proactive defender, therefore, has an ROA of $T/(x + \Omega(\sqrt{T}))$, but the reactive algorithm
has an ROA of T/x, establishing the lemma. □
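The random-walk fact used in the proof can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: the function name `prescient_advantage` and its parameters are illustrative assumptions. It estimates how much more cost a prescient fixed defense inflicts than the T/2 any reactive defense achieves in expectation, and the ratio to $\sqrt{T}$ should stay roughly constant as T grows.

```python
import random
from math import sqrt

def prescient_advantage(T, trials=300, seed=0):
    """Average extra cost a prescient fixed defense inflicts beyond T/2.

    Each round the attacker picks edge e1 or e2 uniformly at random.
    A defender who places the whole budget on the more frequently
    attacked edge inflicts cost 1 on each attack using that edge,
    i.e. max(n1, n2) in total; a reactive defender inflicts T/2 in
    expectation.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        n1 = sum(rng.random() < 0.5 for _ in range(T))  # attacks on e1
        total += max(n1, T - n1) - T / 2
    return total / trials

# Hypothetical usage: the advantage divided by sqrt(T) should be roughly flat.
for T in (100, 400, 1600):
    print(T, prescient_advantage(T) / sqrt(T))
```

The seed and trial count are arbitrary; more trials tighten the estimate at the cost of run time.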


Given this lemma, we show that our reactive algorithm is optimal given the information available.
In this case, n = 2 and, ignoring constants from our main theorem, we are trying to match a
convergence time T of at most $(1 + \alpha^{-1})^2$, which is approximately $\alpha^{-2}$ for small $\alpha$. For
large enough T, there exists a constant c such that $C \ge (T + c\sqrt{T})/T$. By simple algebra,
$(T + c\sqrt{T})/T \ge 1 + \alpha$ whenever $T \le c^2/\alpha^2$, concluding the argument.
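For readers who want the algebra spelled out, the steps below are a short worked derivation under the assumption that $\alpha > 0$ and $c > 0$. It shows that the lower bound on the competitive ratio exceeds $1 + \alpha$ exactly when T is at most $c^2/\alpha^2$.

```latex
\begin{align*}
\frac{T + c\sqrt{T}}{T} \ge 1 + \alpha
  &\iff 1 + \frac{c}{\sqrt{T}} \ge 1 + \alpha \\
  &\iff \sqrt{T} \le \frac{c}{\alpha} \\
  &\iff T \le \frac{c^2}{\alpha^2}.
\end{align*}
```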

We can generalize the above argument of optimality to n > 2 using the combinatorial
Lemma 3.2.1 from (Cesa-Bianchi et al., 1993). Specifically, we can show that for every n,
there is an n-edge graph for which our reactive algorithm is optimal up to constants for small
enough $\alpha$.


5.5  Advantages  of  Reactivity 

In  this  section,  we  examine  some  situations  in  which  a  reactive  defender  out-performs 
a  proactive  defender.  Proactive  defenses  hinge  on  the  defender’s  model  of  the  attacker’s 
incentives.  If  the  defender’s  model  is  inaccurate,  the  defender  will  construct  a  proactive 
defense  that  is  far  from  optimal.  By  contrast,  a  reactive  defender  need  not  reason  about 
the  attacker’s  incentives  directly.  Instead,  the  reactive  defender  learns  these  incentives  by 
observing  the  attacker  in  action. 


Learning Rewards. One way to model inaccuracies in the defender’s estimates of the
attacker’s incentives is to hide the attacker’s rewards from the defender. Without knowledge
of the payoffs, a proactive defender has difficulty limiting the attacker’s ROA. Consider, for
example, the star system whose edges have equal attack surfaces, as depicted in Figure 5.3.
Without knowledge of the attacker’s rewards, a proactive defender has little choice but to
allocate the defense budget equally to each edge (because the edges are indistinguishable).
However, if the attacker’s reward is concentrated at a single vertex, the competitive ratio for
the attacker’s ROA (compared to the rational proactive defense) is the number of leaf vertices.
(We can, of course, make the ratio worse by adding more vertices.) By contrast, the reactive
algorithm analyzed earlier in this chapter is competitive with the rational proactive defense
because the reactive algorithm effectively learns the rewards by observing which attacks the
attacker chooses.
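The star-graph argument can be made concrete with a small calculation. The sketch below assumes ROA is payoff divided by cost and that the cost of attacking an edge is its defense allocation divided by its attack surface; the helper names `roa` and `star_competitive_ratio` are hypothetical and chosen only for illustration.

```python
def roa(reward, allocation, surface, edge):
    """Return-on-attack for a single-edge attack: payoff / cost.

    Assumes cost = d(e) / w(e), with d the defense allocation and
    w the attack surface of the edge.
    """
    cost = allocation[edge] / surface[edge]
    return float("inf") if cost == 0 else reward[edge] / cost

def star_competitive_ratio(n_leaves, budget=1.0, target=0):
    """Ratio of attacker ROA under a uniform defense to ROA under an
    informed defense, when all reward sits at one leaf (`target`)."""
    surface = [1.0] * n_leaves            # equal attack surfaces
    reward = [0.0] * n_leaves
    reward[target] = 1.0                  # reward hidden at one leaf
    uniform = [budget / n_leaves] * n_leaves
    informed = [0.0] * n_leaves
    informed[target] = budget             # defender who knows the reward
    return (roa(reward, uniform, surface, target)
            / roa(reward, informed, surface, target))

print(star_competitive_ratio(5))
```

Under these assumptions the ratio grows linearly with the number of leaves, matching the claim in the text.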


Robustness to Objective. Another way to model inaccuracies in the defender’s estimates
of the attacker’s incentives is to assume the defender mistakes which of profit and
ROA actually matters to the attacker. The defense constructed by a rational proactive
defender depends crucially on whether the attacker’s actual incentives are based on profit or
based on ROA, whereas the reactive algorithm analyzed earlier in this chapter is robust to this
variation. In particular, consider the system depicted in Figure 5.4 and assume the defender
has a budget of 9. If the defender believes the attacker is motivated by profit, the rational
proactive defense is to allocate the entire defense budget to the right-most edge (making the
profit 1 on both edges). However, this defense is disastrous when viewed in terms of ROA
because the ROA for the left edge is infinite (as opposed to near unity when the proactive
defender optimizes for ROA).


Figure 5.3: Star-shaped attack graph with rewards concentrated in an unknown vertex.


Figure 5.4: An attack graph that separates the minimax strategies optimizing ROA and
attacker profit. (Vertex labels: Satellite Office, Internet, Headquarters.)


Catachresis.  The  defense  constructed  by  the  rational  proactive  defender  is  optimized  for 
a  rational  attacker.  If  the  attacker  is  not  perfectly  rational,  there  is  room  for  out-performing 
the  rational  proactive  defense.  There  are  a  number  of  situations  in  which  the  attacker  might 
not  mount  “optimal”  attacks: 

•  The attacker might not have complete knowledge of the attack graph. Consider, for
example, a software vendor who discovers five equally severe vulnerabilities in one
of their products via fuzzing. According to proactive security, the defender ought to
dedicate equal resources to repairing these five vulnerabilities. However, a reactive
defender might dedicate more resources to fixing a vulnerability actually exploited by
attackers in the wild. We can model these situations by making the attacker oblivious
to some edges.

•  The attacker might not have complete knowledge of the defense allocation. For example,
an attacker attempting to invade a corporate network might target computers in
human resources without realizing that attacking the customer relationship management
database in sales has a higher return-on-attack because the database is lightly
defended.

By observing attacks, the reactive strategy learns a defense tuned for the actual attacker,
causing the attacker to receive a lower ROA.


5.6  Generalizations 

We  now  consider  several  simple  generalizations  of  our  model  and  results. 


5.6.1  Horn  Clauses 


Thus far, we have presented our results using a graph-based system model. Our results
extend, however, to a more general system model based on Horn clauses and corresponding
to hypergraph-based system models. Datalog programs, which are based on Horn clauses,
have been used in previous work to represent vulnerability-level attack graphs (Ou et al.,
2006). A Horn clause is a statement in propositional logic of the form
$p_1 \wedge p_2 \wedge \cdots \wedge p_n \rightarrow q$.
The propositions $p_1, p_2, \ldots, p_n$ are called the antecedents, and q is called the consequent. The
set of antecedents might be empty, in which case the clause simply asserts the consequent.
Notice that Horn clauses are negation-free. In some sense, a Horn clause represents an edge
in a hypergraph where multiple pre-conditions are required before taking a certain state
transition.

In the Horn model, a system consists of a set of Horn clauses, an attack surface for each
clause, and a reward for each proposition. The defender allocates defense budget among
the Horn clauses. To mount an attack, the attacker selects a valid proof: an ordered list of
rules such that each antecedent appears as a consequent of a rule earlier in the list. For a
given proof $\Pi$,

$$\mathrm{cost}(\Pi, d) = \sum_{c \in \Pi} d(c)/w(c)$$

$$\mathrm{payoff}(\Pi) = \sum_{p \in [\Pi]} \mathrm{reward}(p)$$

where $[\Pi]$ is the set of propositions proved by $\Pi$ (i.e., those propositions that appear as
consequents in $\Pi$). Profit and ROA are computed as before.

Our results generalize to this model directly. Essentially, we need only replace each
instance of the word “edge” with “Horn clause” and “path” with “valid proof.” For example,
the rows of the matrix M used throughout the proof become the Horn clauses, and the
columns become the valid proofs (which are numerous, but no matter). The entries of
the matrix become $M(c, \Pi) = 1/w(c)$, analogous to the graph case. The one non-obvious
substitution is inc(s), which becomes the set of clauses that lack antecedents.
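The Horn model is compact enough to sketch in code. The following is an illustrative sketch rather than an implementation from this work: the class `HornClause`, the helper names, and the example clauses and numbers are all assumptions. It checks that an ordered list of clauses is a valid proof and evaluates the cost and payoff formulas above, counting each distinct clause once.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HornClause:
    antecedents: frozenset   # propositions required before the clause fires
    consequent: str          # proposition established by the clause

def is_valid_proof(proof):
    """Each antecedent must be the consequent of an earlier clause."""
    proved = set()
    for clause in proof:
        if not clause.antecedents <= proved:
            return False
        proved.add(clause.consequent)
    return True

def cost(proof, d, w):
    """Sum of d(c)/w(c) over the distinct clauses c used in the proof."""
    return sum(d[c] / w[c] for c in set(proof))

def payoff(proof, reward):
    """Sum of rewards over the propositions proved (the consequents)."""
    proved = {c.consequent for c in proof}
    return sum(reward.get(p, 0.0) for p in proved)

# Hypothetical three-clause system.
a = HornClause(frozenset(), "foothold")
b = HornClause(frozenset({"foothold"}), "db_access")
c = HornClause(frozenset({"foothold", "db_access"}), "exfiltrate")
d = {a: 1.0, b: 2.0, c: 0.5}          # defense allocation per clause
w = {a: 1.0, b: 4.0, c: 1.0}          # attack surface per clause
reward = {"db_access": 3.0, "exfiltrate": 5.0}

proof = [a, b, c]
print(is_valid_proof(proof), cost(proof, d, w), payoff(proof, reward))
```

Clauses with an empty antecedent set, such as `a` here, play the role that the edges leaving the start vertex play in the graph model.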


5.6.2  Multiple  Attackers 

We  have  focused  on  a  security  game  between  a  single  attacker  and  a  defender.  In  practice, 
a  security  system  might  be  attacked  by  several  uncoordinated  attackers,  each  with  different 
information  and  different  objectives.  Fortunately,  we  can  show  that  a  model  with  multiple 
attackers is mathematically equivalent to a model with a single attacker with a randomized
strategy: Use the set of attacks, one per attacker, to define a distribution over edges where
the  probability  of  an  edge  is  linearly  proportional  to  the  number  of  attacks  which  use  the 
edge.  This  precludes  the  interpretation  of  an  attack  as  an  s-rooted  path,  but  our  proofs 
do  not  rely  upon  this  interpretation  and  our  results  hold  in  such  a  model  with  appropriate 
modifications. 
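The reduction from several attackers to one randomized attacker amounts to counting edge usage. The sketch below is an assumption-laden illustration (the function name `edge_distribution` and the edge labels are hypothetical): it turns a set of attacks, one per attacker, into a distribution in which each edge's probability is proportional to the number of attacks that use it.

```python
from collections import Counter

def edge_distribution(attacks):
    """Map a collection of attacks (each an iterable of edges) to a
    probability distribution over edges, proportional to usage counts."""
    counts = Counter()
    for attack in attacks:
        counts.update(set(attack))     # count each edge once per attack
    total = sum(counts.values())
    if total == 0:
        return {}
    return {edge: n / total for edge, n in counts.items()}

# Hypothetical example with three uncoordinated attackers.
print(edge_distribution([("e1", "e2"), ("e1",), ("e3", "e1")]))
```

As the text notes, the resulting distribution is over edges and no longer corresponds to a single s-rooted path.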


5.6.3  Adaptive  Proactive  Defenders 


A simple application of an online learning result (Herbster and Warmuth, 1998) modifies
our regret bounds to compare the reactive defender to an optimal proactive defender who
re-allocates  budget  a  fixed  number  of  times.  In  this  model,  our  results  remain  qualitatively 
the  same. 


5.7  Summary 

Many security experts equate reactive security with myopic bug-chasing and ignore
principled reactive strategies when they recommend adopting a proactive approach to risk
management. In this chapter, we establish sufficient conditions for a learning-based reactive
strategy to be competitive with the best fixed proactive defense. Additionally, we show
that reactive defenders can out-perform proactive defenders when the proactive defender
defends against attacks that never actually occur. Although our model is an abstraction
of the complex interplay between attackers and defenders, our results support the following
practical advice for CISOs making security investments:

•  Employ monitoring tools that let you detect and analyze attacks against your
enterprise. These tools help focus your efforts on thwarting real attacks.

•  Make  your  security  organization  more  agile.  For  example,  build  a  rigorous  testing 
lab  that  lets  you  roll  out  security  patches  quickly  once  you  detect  that  attackers  are 
exploiting  these  vulnerabilities. 

•  When  determining  how  to  expend  your  security  budget,  avoid  overreacting  to  the  most 
recent  attack.  Instead,  consider  all  previous  attacks,  but  discount  the  importance  of 
past  attacks  exponentially. 
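The last piece of advice can be illustrated with a simple exponentially discounted allocation. This sketch is not the reactive algorithm analyzed in this chapter; it is a simplified stand-in, and the function name `discounted_allocation`, the discount factor, and the example history are assumptions made for illustration only.

```python
def discounted_allocation(attack_history, edges, budget, discount=0.5):
    """Split `budget` across `edges` in proportion to exponentially
    discounted attack counts.

    attack_history: list of sets of attacked edges, oldest round first.
    discount: factor in (0, 1) applied to older rounds each new round.
    """
    weight = {e: 0.0 for e in edges}
    for attacked in attack_history:
        for e in edges:
            # Older observations decay; the current round counts fully.
            weight[e] = discount * weight[e] + (1.0 if e in attacked else 0.0)
    total = sum(weight.values())
    if total == 0:
        # No attacks observed yet: fall back to an even split.
        return {e: budget / len(edges) for e in edges}
    return {e: budget * weight[e] / total for e in edges}

# Hypothetical history: edge "a" attacked twice, then edge "b" once.
print(discounted_allocation([{"a"}, {"a"}, {"b"}], ["a", "b"], budget=7.0))
```

In this example the most recent attack on "b" receives the larger share, but the earlier attacks on "a" still retain part of the budget, which is the behavior the third bullet recommends.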

In some situations, proactive security can out-perform reactive security. For example,
reactive approaches are ill-suited for defending against catastrophic attacks because there is
no  “next  round”  in  which  the  defender  can  use  information  learned  from  the  attack.  We 
hope  our  results  will  lead  to  a  productive  discussion  of  the  limitations  of  our  model  and  the 
validity  of  our  conclusions. 

Instead  of  assuming  that  proactive  security  is  always  superior  to  reactive  security,  we 
invite  the  reader  to  consider  when  a  reactive  approach  might  be  appropriate.  For  the  parts 
of  an  enterprise  where  the  defender’s  budget  is  liquid  and  there  are  no  catastrophic  losses, 
a  carefully  constructed  reactive  strategy  can  be  as  effective  as  the  best  proactive  defense  in 
the worst case and significantly better in the best case.

Chapter 6

Learning  to  Find  Leaks  in  Open 
Source  Projects 


A  small  leak  can  sink  a  great  ship. 

-  Benjamin  Franklin 


Many open-source projects land security fixes in public repositories before shipping these
patches to users. In this chapter, we show that an attacker who uses off-the-shelf machine
learning techniques can detect these security patches using metadata about the patch (e.g.,
the author of the patch, which files were modified, and the size of the modification). By
analyzing two patches each day for exploitability over a period of 8 months, an attacker
can add 148 days to the window of vulnerability for Firefox 3, increasing the total window
of vulnerability by a factor of 6.4. We argue that obfuscating this metadata is unlikely to
prevent these information leaks because the detection algorithm aggregates weak signals
from a number of features. Instead, open-source projects ought to keep security patches
secret until they are ready to be released.

Although the attacks of this chapter do not target learners per se, we can still classify
them using the taxonomy of Barreno et al. (2006) overviewed in Section 1.2.2. By regarding
the patches of Firefox together with their labels in {‘security’, ‘non-security’} as a training
set, and the public repository consisting of landed pre-release patches as a statistic, our
attacks are Targeted and violate Confidentiality as they aim to determine the labels of
specific patches by examining the repository. While we model an attacker with no control
over the repository, our attacks exploit a significant amount of information in the form of
meta-data from the repository and previously disclosed labels.


6.1  Introduction 

Many important and popular software development projects are open-source, including
Firefox, Chromium, Apache, the Linux kernel, and OpenSSL. Software produced by these
projects is run by hundreds of millions of users and machines. Following the open-source
spirit, these projects make all code changes immediately visible to the public in open code
repositories, including landing fixes to security vulnerabilities in public “trunk” development
branches before publicly announcing the vulnerability and providing an updated version to
end users. This common practice raises the question of whether this extreme openness
increases the window of vulnerability by letting attackers discover vulnerabilities earlier in
the security life-cycle. The conventional wisdom is that detecting these security patches
is difficult because the patches are hidden among a cacophony of non-security changes.
For example, the central Firefox repository receives, on average, 38.6 patches per day, of
which 0.34 fix security vulnerabilities. Recently, some blackhats in the Metasploit project
have used the “description” metadata field to find Firefox patches that refer to non-public
bug numbers (Veditz, 2009). The Firefox developers have responded by obfuscating the
description field, but where does this cat-and-mouse game end?

In  this  chapter,  we  analyze  information  leaks  during  the  Firefox  3  life-cycle  to  answer 
three  key  questions:  (1)  Does  the  metadata  associated  with  patches  in  the  source  code 
repository  contain  information  about  whether  the  patch  is  security  sensitive?  (2)  Using 
this information, how much less effort does an attacker need to expend to find unannounced
security  vulnerabilities?  (3)  How  much  do  these  information  leaks  increase  the  total  window 
of  vulnerability? 

To address these questions, we apply off-the-shelf machine learning techniques to
discriminate between security and non-security patches by observing some intrinsic metadata
about each patch. For example, we use the patch author, the set of files modified, and the
size of the modifications. We strengthen our conclusions by ignoring the description field
because we assume that the developers can successfully obfuscate the patch description.
Using standard machine learning techniques, we show that each of these features individually
does not contain much information about whether patches are security sensitive. We find
that the patch author contains the most information (followed by the top-level directory
containing the modified files and the size of the modification), but even author has a tiny
information gain ratio of 0.003.
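To make the notion of information gain ratio concrete, the sketch below computes it for a categorical feature. It assumes the standard definition (information gain divided by the entropy of the feature itself, as in C4.5); the text does not spell out which variant was used, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, normalized by the
    entropy of the feature (assumed C4.5-style definition)."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0

# Toy data: author as the feature, 1 = security patch, 0 = non-security.
authors = ["alice", "alice", "bob", "bob"]
print(gain_ratio(authors, [1, 1, 0, 0]))   # author fully determines label
print(gain_ratio(authors, [1, 0, 1, 0]))   # author carries no information
```

A ratio near zero, such as the 0.003 reported for the author feature, means that knowing the feature barely reduces uncertainty about the label.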

We use a support vector machine (SVM) to aggregate the information in these features,
but still obtain a poor classifier when measured in terms of precision and recall. However, the
attacker’s goal is not to classify every patch as security-sensitive or non-security-sensitive.
The attacker’s goal is to find at least one vulnerability to exploit; in particular, the attacker
can use the SVM to rank patches by how confident the learner is in classifying the patch
as security-sensitive. This ranking function gives the attacker an informed way to prioritize
patches to examine when searching for security patches. Thus, we propose new metrics to
measure the effectiveness of the detection algorithm. In the first metric, attacker effort, we
measure the number of patches the attacker would need to examine before finding the first
vulnerability according to the ranked list generated by the detection algorithm. Using this
metric, we show that even our weak classifier is useful to the attacker. For example, on 39%
of the days, the detection algorithm ranks at least one security patch in the top two patches.
In the second metric, increase to the window of vulnerability, we show that an attacker who
examines the top two patches ranked by the detection algorithm each day will add an extra
148 days of vulnerability to the 229-day period we study, representing a 6.4-fold increase
over the window of vulnerability caused by the latency in deploying security updates.

Our  results  suggest  that  Firefox  should  change  its  security  life-cycle  to  avoid  leaking 
information  about  unannounced  vulnerabilities  in  its  public  source  code  repositories.  Instead 
of  landing  security  patches  in  the  central  repository,  Firefox  developers  should  land  security 
patches  in  a  private  release  branch  that  is  available  only  to  a  set  of  trusted  testers.  The 
developers  can  then  merge  the  patches  into  the  public  repository  at  the  same  time  they 
release  the  security  update  to  all  users  and  announce  the  vulnerability. 

Although  we  study  Firefox  specifically,  we  believe  our  results  generalize  to  a  number  of 
other  open-source  projects,  including  Chromium,  Apache,  the  Linux  kernel,  and  OpenSSL, 
which  land  vulnerability  fixes  in  public  repositories  before  announcing  the  vulnerability  and 
making security updates available. However, we choose to study these issues in Firefox
because Firefox has a state-of-the-art process for responding to vulnerability reports and
publishes the ground truth about which patches fix security vulnerabilities (Mozilla
Foundation, 2010).


Chapter Organization. The remainder of the chapter is organized as follows. Section 6.2
describes the existing Firefox security life-cycle. Section 6.3 lays out the dataset we analyze.
Section 6.4 explains our methodology. Section 6.5 presents our results. Section 6.6
recommends a secure security life-cycle. Section 6.7 concludes the chapter with a short summary
of the main contributions.


6.2  Life-Cycle  of  a  Vulnerability 

This  section  describes  the  life-cycle  of  a  security  patch  for  the  Firefox  browser.  We  take 
Firefox  as  a  representative  example,  but  many  open-source  projects  use  a  similar  life-cycle. 


6.2.1  Stages  in  the  Life-Cycle 

In the Firefox open-source project, vulnerabilities proceed through a sequence of
observable events:


1.  Bug filed. The Firefox project encourages security researchers to report vulnerabilities
to the project via the project’s public bug tracker. When filed, security bugs are
marked “private” (meaning access is restricted to a trusted set of individuals on the
security team (Veditz, 2010); see Figure 6.1) and are assigned a unique number.


2.  Patch landed in mozilla-central. Once the developers determine the best way to
fix the vulnerability, a developer writes a patch for the mainline “trunk” of Firefox
development. Other developers review the patch for correctness, and once the patch
is approved, the developer lands the patch in the public mozilla-central Mercurial
repository.

Figure 6.1: Information leaked about security-sensitive bug numbers has been exploited by
attackers in the past to identify undisclosed vulnerabilities, when bug numbers were linked
to landed patches via patch descriptions.


3.  Patch  landed  in  release  branches.  After  the  patch  successfully  lands  on 
mozilla-central  (including  passing  all  the  automated  regression  and  performance 
tests),  the  developers  merge  the  patch  to  one  or  more  of  the  Firefox  release  branches. 

4.  Security  update  released.  At  some  point,  a  release  driver  decides  to  release  an 
updated version of Firefox containing one or more security fixes (and possibly some
non-security  related  changes).  These  releases  are  typically  made  from  the  release 
branch,  not  from  the  mozilla-central  repository.  The  current  state  of  the  release 
branch  is  packaged,  signed,  and  made  available  to  users  via  Firefox’s  auto-update 
system. 


Vulnerability announced. The Firefox developers announce the vulnerabilities
fixed in the release (Mozilla Foundation, 2010). For the majority of vulnerabilities,
disclosure is simultaneous with the release of the binary. However, in some
cases disclosure can occur weeks later (after the security update is applied by
most users).


5.  Security update applied. Once a user’s auto-update client receives an updated
version of the Firefox binary, Firefox updates itself and notifies the user that a security
update has been applied. Once the user chooses to install the update, the user is
protected from an attacker exploiting the vulnerability.


Previous work (Frei et al., 2009) has analyzed the dynamics between steps (4) and (5),
finding that the user experience and download size have a dramatic effect on the time delay
and, hence, the window of vulnerability. With a sufficiently slick update experience, browser
vendors can reduce the lag between (4) and (5) to a matter of days. Recent releases of Firefox
have an improved update experience that reduces the window of vulnerability between steps
(4) and (5).

However, not as much attention has been paid to the dynamics between steps (1) and
(4), likely because most people make the assumption that little or nothing is revealed about
a vulnerability until the vulnerability is intentionally disclosed in step (4). Unfortunately,
there are a number of information leaks in this process that invalidate that assumption.


6.2.2  Information  Leaks  in  Each  Stage 


Each stage in the vulnerability life-cycle leaks some amount of information about
vulnerabilities to potential attackers. For example, even the first step leaks some amount of
information because bug numbers are issued sequentially and an attacker can brute force
bug numbers to determine which are “forbidden” and hence represent security vulnerabilities.
Of course, simply knowing that a vulnerability was reported to Firefox does not give
the attacker much useful information for creating an exploit.

The developers leak more information when they land security patches in
mozilla-central because mozilla-central is a public repository. It is unclear, a priori,
whether an attacker will be able to find security patches landing in mozilla-central
because these security patches are landed amid a “thundering herd” of other patches (see
Figure 6.2), but if an attacker can detect that a patch fixes a security vulnerability, the
attacker can learn information about the vulnerability. For example, the attacker learns
where in the code base the vulnerability exists. If the patch fixes a vulnerability by adding
a bounds check, the attacker can look for program inputs that generate large buffers of the
checked type. In this work, we do not evaluate the difficulty of reverse engineering an exploit
from a vulnerability fix, but there has been some previous work (Brumley et al., 2008) on
reverse engineering exploits from binary patches (which is, of course, more difficult than
reverse engineering exploits from source patches).


6.3  Analysis  Goals  and  Setup 

In this section, we describe the dataset and the success metrics for the detection
algorithm.


6.3.1  Dataset 


Set of Patches. In our experiment, we considered the complete life-cycle of Firefox 3,
which lasted over 12 months, contained 14,416 non-security patches, 125 security patches,
and 12 security updates. In particular, we use publicly available data starting from the
release of Firefox 3 and ending with the release of Firefox 3.5. Also, to strengthen our
results, we focus on the mozilla-central repository, which receives the vast majority
of Firefox development effort. We cloned the entire mozilla-central repository to our
experimental machines to identify all patches during the life-cycle of Firefox 3. We ignore
the release branches to evaluate how well our detection algorithm is able to find security fixes
amid mainline development (see Figure 6.2). Even though we initially limit our attention
to Firefox 3, we repeat our analysis on Firefox 3.5 in Section 6.5.4 and expect our results
to generalize to other releases of Firefox and, more generally, to other open-source projects.


Figure 6.2: Attackers must find security patches within a “thundering herd” of non-security
patches.


Ground Truth. We determined the “ground truth” of whether a patch fixes a security
vulnerability by examining the list of known vulnerabilities published by Firefox (Mozilla
Foundation, 2010). Each CVE listed on the known vulnerability web page contains a link to one
or more entries in the Firefox bug database. At the time we crawled these bug entries
(after disclosure), the bug entries were public and contained links to the Mercurial commits
that fixed the vulnerabilities (both in mozilla-central and in the release branches). Our
crawler harvested these links and extracted the unique identifier for each patch.

The  known  vulnerability  page  dates  each  vulnerability  disclosure,  and  we  assume  that 
these  disclosure  dates  are  accurate.  Each  bug  entry  is  timestamped  with  its  creation  date 
and every message on the bug thread is dated as well. Finally, the mozilla-central pushlog
website contains the date and time of every change in the “pushlog,” which we also assume
is authoritative.


6.3.2  Success  Metrics 

Given a dataset of past patches, an attacker can label the patches based on whether
these patches have been announced as vulnerability fixes. Using these labels, the attacker
can train a statistical machine learning algorithm to predict whether a current patch fixes
an (unannounced) vulnerability. Using this machine learning algorithm, the attacker can
classify a new patch into one of two classes: security-sensitive and non-security-sensitive.
However, the attacker’s goal is actually not to classify each patch correctly. For example,
the attacker does not care if a detection algorithm has a high false negative rate (often
incorrectly classifies security vulnerabilities as non-vulnerability patches) if the algorithm reliably
finds at least one security vulnerability; the attacker need only exploit one vulnerability to
compromise users’ computers.

Instead, the attacker’s goal is to build a detection algorithm that makes it easier to find at
least one (unannounced) vulnerability. In particular, we consider detection algorithms that
output a real-valued confidence for their prediction about whether a given patch is a security
patch. The attacker can use this confidence value to rank a set of patches, and then examine
the patches in rank order. In this way, the detection algorithm prioritizes which patches the
attacker should examine for exploitability. The usefulness of the detection algorithm, then,
lies in how much effort the detector saves the attacker by giving real security patches higher
priorities than non-security patches. By reducing the amount of effort the attacker must
expend to find a vulnerability, the detector lets the attacker find vulnerabilities earlier and
thereby increases the window of vulnerability. We formalize these notions into the following
two success metrics.


6.3.2.1  Cost of Vulnerability Discovery: Attacker Effort

Given a set of patches and a ranking function, we call the rank of the first true security
patch the attacker effort. This quantity reflects the number of patches the attacker has to
examine when searching the ranked list before finding the first patch that fixes a security
vulnerability. For example, if the third-highest ranked patch actually fixes a security
vulnerability, then the attacker needs to examine three patches before finding the vulnerability,
resulting in an attacker effort of three. Using this metric, we can compute the percent of
days on which an attacker who expends a given effort will be able to find a security patch.
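The attacker-effort metric is straightforward to compute from a detector's confidence scores. The following is a minimal sketch under assumed inputs: the function names, the score lists, and the per-day data layout are illustrative, not taken from the experiments.

```python
def attacker_effort(scores, is_security):
    """Rank (1-based) of the first true security patch when patches are
    examined in decreasing order of score; None if there is none."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for rank, i in enumerate(order, start=1):
        if is_security[i]:
            return rank
    return None

def fraction_of_days_found(days, effort_budget):
    """Fraction of days on which an attacker who examines at most
    `effort_budget` patches finds a security patch.

    days: list of (scores, is_security) pairs, one pair per day.
    """
    hits = 0
    for scores, labels in days:
        effort = attacker_effort(scores, labels)
        if effort is not None and effort <= effort_budget:
            hits += 1
    return hits / len(days)

# Hypothetical day: the security patch has the third-highest score.
print(attacker_effort([0.1, 0.9, 0.4, 0.7], [False, False, True, False]))
```

Days with no security patch count as misses in `fraction_of_days_found`, which is one possible convention; a different treatment of such days would change the reported percentage.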


6.3.2.2  Benefit for the Attacker: Window of Vulnerability


Another  metric  we  propose  is  the  increase  to  the  window  of  vulnerability  due  to  the 
assistance  of  the  detection  algorithm.  In  particular,  an  attacker  who  discovers  a  vulnerability 
d  days  before  the  next  security  update  increases  the  total  window  of  vulnerability  for  Firefox 
users  by  d  days.  (Notice  that  knowing  of  multiple  vulnerabilities  simultaneously  does  not 
increase  the  aggregate  window  of  vulnerability  because  knowing  multiple  vulnerabilities 
simultaneously  is  redundant.) 


Previous work (Duebendorfer and Frei, 2009) explores the effectiveness of browser update
mechanisms, finding that security updates take some amount of time to propagate to users.
In particular, they measured the cumulative distribution of the number of days users take to
update their browsers after security updates (Duebendorfer and Frei, 2009, Figure 3). After
about 10 days, the penetration growth rate rapidly decreases, asymptotically approaching
about 80%. By integrating the area above the CDF up to 80%, we can estimate the expected
number of days a user takes to update Firefox 3, conditioned on their being among the first 80%
who update. We estimate this quantity, the post-release window of vulnerability, to be 3.4
days, which we use as a baseline value for comparing windows of vulnerability.
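One way to read the "area above the CDF" estimate is sketched below. The symbols are introduced here for illustration only: $T$ is the number of days a user takes to update, $F$ is the measured update CDF, and $t_{80}$ is the point at which $F$ levels off at 0.8. Whether the original estimate normalized by 0.8 in exactly this way is an assumption.

```latex
\[
  \mathbb{E}\left[T \mid T \le t_{80}\right]
    = \int_0^{t_{80}} \Pr\left(T > t \mid T \le t_{80}\right) dt
    = \int_0^{t_{80}} \left(1 - \frac{F(t)}{0.8}\right) dt
    = \frac{1}{0.8} \int_0^{t_{80}} \bigl(0.8 - F(t)\bigr)\, dt .
\]
```

The last integral is the area between the horizontal line at 80% and the CDF, which matches the description of integrating the area above the CDF up to 80%.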


6.3.3  Baseline:  The  Random  Ranker 

To  quantify  the  benefit  of  using  machine  learning,  we  compare  our  detection  algorithm 
with  a  random  ranker  who  examines  available  patches  in  a  random  order.  This  straw-man 
algorithm  has  two  key  properties:  (1)  our  attacker  cost  and  benefit  metrics  are  easy  to 
measure  for  the  random  ranker  (as  we  detail  below);  (2)  perhaps  more  importantly,  the 
random  ranker  models  a  real-world  attacker  who  has  access  to  mozilla-central  but  does 
not  have  any  reason  to  examine  landed  patches  in  any  particular  order. 

As shown in the next two sections, both expected cost and benefit metrics can be computed
exactly and efficiently for the random ranker, given the total number of patches and
the number of security patches.


6.3.4  Deriving  Random  Ranker  Expected  Effort 

We model the random ranker as iteratively selecting patches one at a time, uniformly at
random from the pool of patches available in mozilla-central, without replacement.
This attacker’s effort is the random number $X$ of patches the attacker must examine up to
and including the first patch drawn that fixes a vulnerability. We summarize the cost of
using unassisted random ranking via the expected attacker effort

\[ \mathbb{E}[X] = \frac{n_s\,(n-n_s)!}{n!} \sum_{x=1}^{n-n_s+1} x\,\frac{(n-x)!}{(n-n_s-x+1)!} , \tag{6.1} \]

where $n \in \mathbb{N}$ is the total number of patches available in the pool, and $1 \le n_s \le n$ is the
number of these patches that fix vulnerabilities. We now derive Equation (6.1).

If the ranker’s sampling were performed with replacement, then the distribution of attacker
effort $X$ would be geometric with known expectation. Without replacement, if there
are $n$ patches in the pool, of which $n_s$ fix vulnerabilities, $X$ has probability mass

\[ \Pr(X = x) = \frac{n_s}{n-x+1} \prod_{k=1}^{x-1} \frac{n-n_s-k+1}{n-k+1} \tag{6.2} \]
\[ \phantom{\Pr(X = x)} = \frac{n_s\,(n-n_s)!}{n!} \cdot \frac{(n-x)!}{(n-n_s-x+1)!} \tag{6.3} \]

for $x \in \{1, \ldots, n - n_s + 1\}$ and zero otherwise. The second equality follows from some
simple algebra. The first equality is derived as follows. The probability of the first draw
being a non-security patch is the number of non-security patches over the number of patches,
or $(n - n_s)/n$. Conditioned on the first patch not fixing a vulnerability, the second draw
has probability $(n - n_s - 1)/(n - 1)$ of being non-security related since one fewer patch
is in the pool (which is, in particular, a non-security patch). This process continues with
the $k$th draw having (conditional) probability $(n - n_s - k + 1)/(n - k + 1)$ of being a
non-security patch. After drawing $k$ non-security patches, the probability of selecting a patch
that fixes a vulnerability is $n_s/(n - k)$. Equation (6.2) follows by chaining these conditional
probabilities.

With $X$’s probability mass in hand, the expectation can be efficiently computed for any
moderate $(n, n_s)$ pair by summing Equation (6.3).
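The chaining of conditional probabilities described above can be sketched as follows. This is a minimal illustration using exact rational arithmetic; the function names `effort_pmf` and `expected_effort` are hypothetical and not taken from the original study.

```python
from fractions import Fraction


def effort_pmf(n, n_s):
    """Return {x: Pr(X = x)} for the random ranker drawing without
    replacement from n patches, n_s of which fix vulnerabilities."""
    if not 1 <= n_s <= n:
        raise ValueError("need 1 <= n_s <= n")
    pmf = {}
    no_fix_so_far = Fraction(1)  # probability the first x-1 draws were all non-security
    for x in range(1, n - n_s + 2):
        remaining = n - x + 1  # patches still in the pool before draw x
        pmf[x] = no_fix_so_far * Fraction(n_s, remaining)
        no_fix_so_far *= Fraction(remaining - n_s, remaining)
    return pmf


def expected_effort(n, n_s):
    """Expected attacker effort, obtained by summing x * Pr(X = x)."""
    return sum(x * p for x, p in effort_pmf(n, n_s).items())


print(expected_effort(100, 1))  # a single security patch among 100
```

As a sanity check that is not part of the original text, the sum agrees with the standard closed form $(n+1)/(n_s+1)$ for the expected position of the first success when sampling without replacement.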


6.3.5 Deriving Random Ranker Expected Vulnerability Window Increase

We begin by constructing the distribution of the first day an undisclosed vulnerability
fix is found after a security update, when the random ranker is constrained to a budget $b$
of patches daily, and never re-examines patches. Let $n_t$ and $n_{t,s}$ denote the number of new
patches and new vulnerability fixes landed on day $t \in \mathbb{N}$. Let random variable $X_{n,n_s}$ be the
attacker effort required to find one of $n_s$ vulnerability fixes out of a pool of $n$ patches as
described above. Finally, let $A_t$ be the event that the first vulnerability fix is found on day
$t \in \mathbb{N}$. Then trivially

\[ \Pr(A_1) = \Pr\left(X_{n_1,\,n_{1,s}} \le b\right) . \]

Now we may condition on $\neg A_1$ to express the probability of $A_2$ occurring: if $A_1$ does not
occur then $b$ non-security-related patches are removed from the pool, so that the pool consists
of $n_{1,s} + n_{2,s}$ vulnerability fixes and $n_1 + n_2 - b$ patches total. The conditional probability
of $A_2$ given $\neg A_1$ is then the probability of $\{X_{n_1+n_2-b,\;n_{1,s}+n_{2,s}} \le b\}$. By induction we can continue
to exploit this conditional independence to yield for all $t > 0$

\[ \Pr\left(A_{t+1} \,\middle|\, \bigcap_{i=1}^{t} \neg A_i\right) = \Pr\left(X_{\sum_{i=1}^{t+1} n_i - tb,\;\sum_{i=1}^{t+1} n_{i,s}} \le b\right) . \tag{6.4} \]

The RHS of this expression is easily calculated by summing Equation (6.3) over $x \in [b]$.
The unconditional probability distribution now follows from the mutual exclusivity of the
$A_t$ and the chain rule of probability

\[ \Pr(A_{t+1}) = \Pr\left(A_{t+1} \,\middle|\, \bigcap_{i=1}^{t} \neg A_i\right) \prod_{i=1}^{t} \left(1 - \Pr\left(A_i \,\middle|\, \bigcap_{j=1}^{i-1} \neg A_j\right)\right) . \]

Thus we need only compute the expression in Equation (6.4) once for each $t \in [N]$,
where $N$ is the number of days until the next security update. From these conditional
probabilities we can efficiently calculate the unconditional $\Pr(A_t)$ for each $t \in [N]$. Noting
that $A_t$ implies an increase of $Y = N - t + 1$ to the window of vulnerability, the expected
increase is

\[ \mathbb{E}[Y] = \sum_{t=1}^{N} (N - t + 1)\,\Pr(A_t) . \tag{6.5} \]

Figure 6.3: The distribution of the random ranker’s effort $X$ as a function of $n_s$, for $n = 100$,
as given by Equation (6.3).

Figure 6.4: The random ranker’s expected effort $\mathbb{E}[X]$ as a function of $n_s$ for
$n \in \{20, 40, 60, 80, 100\}$, as given by Equation (6.1).

Remark 55. Notice that there can be a non-trivial probability that no vulnerability fix will be
found by the random ranker in the $N$ day period. This probability is simply $1 - \sum_{t=1}^{N} \Pr(A_t)$.
On typical inter-update periods this probability can be higher than 0.5 for budgets of about 1. This
fact serves to reduce the expected increase to the window of vulnerability, particularly for
small budgets.

Remark  56.  The  astute  reader  will  notice  that  we  removed  b  non-security-related  patches 
from  the  pool  on  all  days  we  do  not  find  a  vulnerability  fix,  irrespective  of  whether  b  or  more 
such patches are present. We have assumed that n is large for simplicity of exposition. Once
n drops to $n_s + b$ or lower, we remove all non-security-related patches upon failing to find
a vulnerability fix. On the next day, the probability of finding a vulnerability fix is unity.
The probabilities of finding vulnerability fixes on subsequent days are thus zero. Thus as
b increases, the distribution becomes more and more concentrated at the start of the
inter-update period, as we would expect. Finally, if no vulnerability fixes are present in the pool on
a particular day, then the probability of finding such a patch is trivially zero.

[Figure 6.5 plot: “Effect of Increasing Pool Size with Constant Fraction”; x-axis: number of patches.]

[Figure 6.6 plot: “Random Ranker: Vulnerability Window vs. Effort”; x-axis: patches attacker is willing to examine daily (log scale).]

Figure 6.5: The random ranker’s expected effort $\mathbb{E}[X]$ as a function of $n$ for constant
fractions of security patches $n_s/n \in \{0.0032, 0.01, 0.032, 0.1, 0.32\}$, as given by
Equation (6.1).

Figure 6.6: The random ranker’s expected vulnerability window increase vs. daily budget,
for a 31 day cycle with 39 patches daily (the Firefox 3 averages). Benefits shown for the
security patch fractions of Figure 6.5.
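The day-by-day recursion of this section, including the small-pool case discussed in Remark 56, can be sketched as below. This is an illustrative implementation under stated assumptions: the function names and the list-based inputs (`new_patches[t-1]` and `new_fixes[t-1]` for day t) are hypothetical, and the probability of finding a fix within the daily budget is computed as one minus the probability that all examined patches are non-security, which is equivalent to summing the effort distribution over the budget.

```python
from fractions import Fraction


def prob_found_within_budget(n, n_s, b):
    """Pr(effort <= b) for a pool of n patches containing n_s fixes."""
    if n_s == 0:
        return Fraction(0)  # nothing to find
    miss = Fraction(1)  # probability that the first k draws are all non-security
    for k in range(b):
        non_fixes_left = n - n_s - k
        if non_fixes_left <= 0:
            return Fraction(1)  # non-security patches exhausted: a fix must be drawn
        miss *= Fraction(non_fixes_left, n - k)
    return 1 - miss


def expected_window_increase(new_patches, new_fixes, b):
    """Return (E[Y], Pr(no fix found in N days)) for daily budget b."""
    N = len(new_patches)
    n = n_s = 0
    none_so_far = Fraction(1)  # Pr(no fix found on days 1..t-1)
    expected = Fraction(0)
    for t in range(1, N + 1):
        n += new_patches[t - 1]
        n_s += new_fixes[t - 1]
        p_cond = prob_found_within_budget(n, n_s, b)  # conditional on no earlier find
        expected += (N - t + 1) * none_so_far * p_cond
        none_so_far *= 1 - p_cond
        n -= min(b, n - n_s)  # examined non-security patches leave the pool
    return expected, none_so_far


print(expected_window_increase([2, 0], [1, 0], 1))
```

In the toy call, two patches (one a fix) land on day 1 and the attacker examines one patch per day: the fix is found on day 1 or day 2 with probability 1/2 each, giving an expected increase of 2(1/2) + 1(1/2) = 3/2 days.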


6.3.5.1 Understanding the Random Ranker Metrics


The probability mass and expectation of $X$ are explored in Figures 6.3 and 6.4. For
$n_s = 1$ the distribution of effort is uniform; and as the number of security patches increases
under a constant pool size, mass quickly concentrates on lower effort (note that in each
figure attacker effort is depicted on a log scale). Similarly the significant effect of varying
$n_s$ on the expected effort can be seen in Figure 6.4.


For constant fractions of patches that fix a security vulnerability, the expected effort to
find a security patch and the expected vulnerability window increase for a typical Firefox
3 inter-point release cycle are shown in Figures 6.5 and 6.6. In both cases effort is shown
on a log scale. The former figure shows that under a growing pool of patches with constant
fraction being security-related, the attacker effort for finding a security patch is not constant
but in fact increases as the pool expands. For a typical cycle (of length 31 days), typical
patch landing rate (of 39 patches daily) and fixed fraction of landed patches that are
security-related, Figure 6.6 shows the expected window increase as a function of the daily budget.
Again we see a great difference over increasing proportions of security patches, and the
effect of the proportion on the dependence of benefit on budget. Finally notice that the
average fraction of security-related patches for Firefox 3 is 0.0085, and so the corresponding
curves at $10^{-2}$ should approximate the performance of the random ranker for Firefox 3, as is
verified in Section 6.5. The utility of including curves at atypical security patch rates (from
the perspective of Firefox) is to preview the cost and benefit achieved by the random ranker
as applied to other open-source projects.
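The growth of expected effort with pool size at a constant security fraction can be made explicit with a short calculation. It relies on the closed form $\mathbb{E}[X] = (n+1)/(n_s+1)$, a standard identity for sampling without replacement that is not stated in the text above; the symbol $f = n_s/n$ is introduced here for illustration.

```latex
\[
  \mathbb{E}[X] = \frac{n+1}{n_s+1} = \frac{n+1}{fn+1},
  \qquad
  \frac{d}{dn}\,\frac{n+1}{fn+1} = \frac{1-f}{(fn+1)^2} > 0 \quad \text{for } f < 1,
  \qquad
  \lim_{n \to \infty} \frac{n+1}{fn+1} = \frac{1}{f} .
\]
```

For example, with $f = 0.01$ the expected effort is $101/2 = 50.5$ at $n = 100$ and $1001/11 = 91$ at $n = 1000$, approaching the limit of 100 from below.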


6.4  Methodology 

In this section, we describe the methodology we use to analyze information leaks in the
security life-cycle. We first describe the features we observe and then outline our approach
to building a detection algorithm.


6.4.1  Features  Used  By  the  Detector 

There are a number of features we could use to identify security patches. Blackhats
in the Metasploit project have used the “description” metadata to determine whether a
patch fixes a security vulnerability. Firefox patches typically reference the bug number that
they fix in their descriptions. By attempting to access the indicated bug, an attacker can
determine whether the patch references a security-sensitive bug (see Figure 6.1). When the
Firefox developers became aware of Metasploit’s actions, they took steps to obfuscate the
patch description (Veditz, 2009).


Obfuscating and de-obfuscating the patch description is clearly a cat-and-mouse game.
Instead of analyzing leaks in the patch description, we strengthen our conclusions by
assuming that the Firefox developers are able to perfectly obfuscate the patch description.
We therefore analyze information leaks in other metadata associated with
the patch (see Figure 6.7).


•  Author. We hypothesize that information about the patch author (the developer
who wrote the patch) will leak a sizable amount of information because Firefox has
a security team (Veditz, 2010) that is responsible for fixing security vulnerabilities.
Most members of the Firefox community do not have access to security bugs and are
unlikely to write security patches.


•  Top-level directory. For each file that was modified by the patch, we observe the
top-level directory in the repository that contains the file. In the Firefox directory
structure, the top-level directory roughly corresponds to the module containing the
file. If a patch touches more than one top-level directory, we pick the directory that
contains the most modified files.


•  File type. For each file that was modified by the patch, we observe the file’s extension
to impute the type of the file. For example, patches to Firefox often modify C++
implementation files, interface description files, and XML user interface descriptions.
If a patch touches more than one type of file, we pick the file type with the most
modified files.

•  Patch size. We observe a number of size metrics for each patch, including the total
size of the diff in characters, the number of lines in the diff, the number of files in
the diff, and the average size of all modified files. Although these features are highly
correlated, the SVM’s ability to model non-linear patterns lets us take advantage of
all these features.

•  Temporal. The timestamp for each patch reveals the time of day and the day of week
the patch was landed in the mozilla-central repository. We include these features
in case, for example, some developers prefer to land security fixes at night or on the
weekends.

Figure 6.7: An example patch from the Firefox Mercurial repository. In addition to patch
description and bug number, several features leak information about the security-related
nature of a patch.

We presented nominal features (author, top-level directory, and file type) to the SVM as
binary vectors. For example, the $i$th author out of $N$ developers in the Firefox project is
represented as $N - 1$ zeros and a single 1 in the $i$th position.
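The nominal-feature handling described above can be sketched as follows. The helper names, the use of `/` as the path separator, and the example paths and author names are all hypothetical; this only illustrates picking the dominant directory or file type and the binary-vector encoding.

```python
import os
from collections import Counter


def top_level_dir(path):
    """First path component of a repository-relative path."""
    return path.split("/", 1)[0]


def file_type(path):
    """File extension, used as a stand-in for the file's type."""
    return os.path.splitext(path)[1]


def dominant(values):
    """Value that occurs most often among a patch's modified files."""
    return Counter(values).most_common(1)[0][0]


def one_hot(value, vocabulary):
    """Binary vector with a single 1 at the position of `value`."""
    vec = [0] * len(vocabulary)
    vec[vocabulary.index(value)] = 1
    return vec


# Hypothetical patch touching three files.
paths = ["js/src/a.cpp", "js/src/b.cpp", "layout/c.xml"]
print(dominant([top_level_dir(p) for p in paths]))
print(dominant([file_type(p) for p in paths]))
print(one_hot("bob", ["alice", "bob", "carol"]))
```

With N possible values the encoding produces a length-N vector, so the author feature alone contributes several hundred dimensions.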

Although these features are harder to obfuscate than the free-form description field, we
do not claim that these features cannot be obfuscated. Instead, we claim that there are a
large number of small information leaks that can be combined to detect security patches.
Of course, this set of features is far from exhaustive and serves only to lower bound the
attacker’s abilities.


6.4.2 Detection Approach


Algorithm. For our detection algorithm, we use the popular libsvm library for support
vector machine (SVM) learning (Chang and Lin, 2001). Although we could improve
our metrics by tuning the learning algorithm, we choose to use the default configuration
to strengthen our conclusions: extracting basic features (as detailed above) and running
libsvm in its default configuration requires only basic knowledge of Python and no
expertise in machine learning.

Support vector machines perform supervised binary classification by learning a maximum-margin
hyperplane in a high-dimensional feature space (Burges, 1998; Cristianini and Shawe-Taylor,
2000; Schölkopf and Smola, 2001). Many feature mappings are possible, and the
default libsvm configuration uses the feature mapping induced by the Radial Basis Function
kernel, which takes a parameter $\gamma$ that controls kernel width. The SVM takes another
parameter $C$, which controls regularization. An attacker need not know how to
set these parameters because libsvm chooses the parameters that optimize 5-fold
cross-validation estimates over a grid of $(C, \gamma)$ pairs. The optimizing pair is then used to train
the final SVM model. We enable a feature of libsvm that learns posterior probability estimates
Pr(patch fixes a vulnerability | patch) rather than security/non-security class predictions
(Lin et al., 2007). We refer to these posterior probabilities as probabilities or scores.
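To make the role of the two parameters concrete, the sketch below evaluates the Radial Basis Function kernel in the form libsvm documents, exp(-gamma * ||u - v||^2), and enumerates a grid of candidate (C, gamma) pairs. The exponent ranges for the grid are an assumption made here, modeled on the defaults commonly cited for libsvm's grid-search helper script, and should be checked against the tool; no SVM is trained in this sketch.

```python
import math


def rbf_kernel(u, v, gamma):
    """Radial Basis Function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


# Assumed search grid: powers of two for C and gamma (illustrative ranges).
grid = [(2.0 ** log2_c, 2.0 ** log2_g)
        for log2_c in range(-5, 16, 2)
        for log2_g in range(3, -16, -2)]

# Larger gamma makes the kernel narrower: similarity decays faster with distance.
u, v = [1.0, 0.0], [0.0, 0.0]
print(rbf_kernel(u, v, 0.1), rbf_kernel(u, v, 10.0))
print(len(grid))
```

Each pair in the grid would be scored by 5-fold cross-validation, and the best-scoring pair used to train the final model.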


After  training  an  SVM  on  patches  labeled  as  security  or  non-security,  we  can  use  the 
SVM  to  rank  a  set  of  previously  unseen  patches  by  ordering  the  patches  in  decreasing  order 
of  score.  If  the  SVM  is  given  sufficient  training  data,  we  expect  the  higher-ranked  patches 
to be more likely to fix vulnerabilities. As we show in Section 6.5, even though the SVM
scores are unsuitable for classification, they are an effective means for ranking patches.

Note  that  detecting  patches  that  repair  vulnerabilities  can  be  cast  as  learning  problems 
other  than  scalar-valued  supervised  classification.  For  example,  we  could  take  a  more  direct 
approach  via  ranking  or  ordinal  regression  (although  these  again  do  not  directly  optimize 
our  primary  interest:  having  one  security  patch  ranked  high).  However,  we  use  an  SVM 
because  it  balances  statistical  performance  for  learning  highly  non-linear  decision  rules  and 
availability  of  off-the-shelf  software  appropriate  for  data  mining  novices. 


Online  learning.  To  limit  the  detector  to  information  available  to  real  attackers,  we 
perform  the  following  simulation  using  the  dates  collected  in  our  data  set.  For  each  day, 
starting  on  the  day  Firefox  3  was  released  and  ending  on  the  day  Firefox  3.5  was  released, 
we  perform  the  following  steps: 


1.  We train a fresh SVM on all the patches landed in the repository between the day
Firefox 3 was released and the most recent security update before the current day, labeling
each patch according to the publicly known vulnerabilities list (Mozilla Foundation,
2010).^1

2.  We then use the trained SVM to rank all the patches landed since the most recent
security update.


^1 Note that not all security patches are disclosed as fixing vulnerabilities by the following release. Such
patches are necessarily (mis)labeled as non-security, and trained on as such. Once the true patch is disclosed,
we re-label and re-train. The net effect of delayed disclosure is a slight degradation to the SML-assisted
ranker’s performance.


After  running  the  complete  online  simulation,  we  observe  the  highest  ranking  received  by  a 
real vulnerability fix on each day. This ranking corresponds to the SVM-assisted attacker
effort  for  that  day — the  number  of  patches  an  attacker  using  the  SVM  would  need  to  analyze 
before  encountering  a  security  patch.  For  each  day  we  also  compute  the  expected  unassisted 
attacker  effort  as  represented  by  Equation  (6.1),  using  the  size  of  that  day’s  available  pool 
of  patches  and  number  of  security  patches. 
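The two simulation steps and the daily effort measurement can be sketched as a loop. This is a simplified illustration, not the study's code: the patch dictionaries, the `train_and_score` callback, and the toy `author_rate_scorer` (a stand-in for the SVM) are hypothetical, and delayed disclosure of labels is ignored.

```python
import bisect


def first_security_rank(scores, pool):
    """1-based rank of the first security patch in decreasing score order."""
    order = sorted(range(len(pool)), key=lambda j: scores[j], reverse=True)
    for rank, j in enumerate(order, start=1):
        if pool[j]["is_security"]:
            return rank
    return None


def simulate_online(patches, update_days, days, train_and_score):
    """For each day, train on patches landed up to the most recent security
    update strictly before that day, then rank patches landed since."""
    efforts = {}
    for day in days:
        k = bisect.bisect_left(update_days, day)  # updates strictly before `day`
        if k == 0:
            continue  # no labeled data yet
        last_update = update_days[k - 1]
        train = [p for p in patches if p["day"] <= last_update]
        pool = [p for p in patches if last_update < p["day"] <= day]
        if not pool:
            efforts[day] = None
            continue
        efforts[day] = first_security_rank(train_and_score(train, pool), pool)
    return efforts


def author_rate_scorer(train, pool):
    """Toy scorer: fraction of an author's training patches that were security fixes."""
    def rate(author):
        mine = [p for p in train if p["author"] == author]
        return sum(p["is_security"] for p in mine) / len(mine) if mine else 0.0
    return [rate(p["author"]) for p in pool]


patches = [
    {"day": 1, "author": "a", "is_security": True},
    {"day": 1, "author": "b", "is_security": False},
    {"day": 2, "author": "b", "is_security": False},
    {"day": 4, "author": "b", "is_security": False},
    {"day": 5, "author": "a", "is_security": True},
    {"day": 5, "author": "c", "is_security": False},
]
print(simulate_online(patches, [3], range(1, 7), author_rate_scorer))
```

Retraining from scratch each day keeps the simulated attacker restricted to information that was public at that time.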


6.5  Results 

We  now  present  the  results  of  searching  for  security  patches  in  mozilla-central.  We 
first explore the discriminative power of the individual features and then compare an SVM-
assisted  attacker  to  an  unassisted  attacker  using  a  random  patch  ranking. 


6.5.1  Feature  Analysis 


Analyzing Discriminative Power of Individual Features. Prior to computing the
SVM-assisted  attacker  effort,  we  analyzed  the  ability  of  individual  features  to  discriminate 
between  security  and  non-security  patches.  We  adopt  the  information  theoretic  information 
gain  ratio,  which  reflects  the  decrease  in  entropy  of  the  sequence  of  training  set  class  labels 
when  split  by  each  individual  feature.  Before  running  the  feature  analysis  on  Firefox  data, 
we  briefly  overview  background  on  the  information  gain  ratio. 

The  information  theoretic  quantity  known  as  the  information  gain  measures  how  well 
a feature separates a set of training data, and is popular in information retrieval and in
machine learning within the ID3 and C4.5 decision tree learning algorithms (Mitchell, 1997).


For training set $S$ and nominal feature $F$ taking discrete values in $\mathcal{X}_F$, the information
gain is defined as

\[ \mathrm{Gain}(S, F) = \mathrm{Entropy}(S_\ell) - \sum_{x \in \mathcal{X}_F} \frac{|S_{\ell,x}|}{|S_\ell|}\,\mathrm{Entropy}(S_{\ell,x}) , \tag{6.6} \]

where $S_\ell$ denotes the multiset of $S$’s example binary labels, $S_{\ell,x}$ denotes the subset of these
labels for examples with feature $F$ value $x$, and for multiset $T$ taking possible values in
$\mathcal{X}$ we have the usual definition of $\mathrm{Entropy}(T) = -\sum_{x \in \mathcal{X}} \frac{|T_x|}{|T|} \log_2 \frac{|T_x|}{|T|}$,
with $T_x$ the sub-multiset of elements of $T$ equal to $x$. The first term of
the information gain, the entropy of the training set, corresponds to the impurity of the
examples’ labels. A pure set with only one repeated label has zero entropy, while a set
having half positive examples and half negative examples has a maximum entropy of one.
The information gain’s second term corresponds to the expected entropy of the training
set conditioned on the value of feature $F$. Thus a feature having a high information gain
corresponds to a large drop in entropy, meaning that splitting on that feature results in
a partition of the training set into subsets of like labels. A low (necessarily non-negative)
information gain corresponds to a feature that is not predictive of class label.

[Figure 6.8 plot: “Analysis of Individual Features' Discriminative Power”; x-axis: feature.]

Figure 6.8: The features ordered by decreasing ability to discriminate between security and
non-security patches, as represented by the information gain ratio.


Two issues require modification of the basic information gain before use in
practice (Mitchell, 1997). The first is that nominal features $F$ with large numbers of discrete
values $|\mathcal{X}_F|$ tend to have artificially inflated information gains (being as high as $\log_2 |\mathcal{X}_F|$)
since splitting on such features can lead to numerous small partitions of the training set with
trivially pure labels. An example is the author feature, which has close to 500 values. In such
cases it is common practice to correct for this artificial inflation by using the information
gain ratio (Quinlan, 1986) as defined below. We use $S_F$ to denote the multiset of examples’
feature $F$ values in the ratio’s denominator, which is known as the split information.

\[ \mathrm{GainRatio}(S, F) = \frac{\mathrm{Gain}(S, F)}{\mathrm{Entropy}(S_F)} \tag{6.7} \]


The second issue comes from taking the idea of many-valued nominal features to the
extreme: continuous features such as the diff length (of which there are 7,572 unique values
out of 14,541 patches in our dataset) and the file size (which enjoys 12,795 unique values)
are analyzed by forming a virtual binary feature for each possible threshold on the feature.
The information gain (ratio) of a continuous feature is defined as the maximum information
gain (ratio) of any induced virtual binary feature (Fayyad, 1992).
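The quantities in this background discussion can be sketched compactly. This is an illustrative implementation, not the code used for the analysis; the function names are hypothetical, and thresholds for continuous features are taken at each distinct observed value except the largest.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(labels, feature):
    """Entropy of the labels minus their expected entropy after splitting on `feature`."""
    groups = {}
    for y, x in zip(labels, feature):
        groups.setdefault(x, []).append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(labels, feature):
    """Information gain normalized by the split information (entropy of the feature)."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(labels, feature) / split_info


def best_threshold_gain_ratio(labels, values):
    """Maximum gain ratio over the virtual binary features `value <= threshold`."""
    best = 0.0
    for thr in sorted(set(values))[:-1]:
        best = max(best, gain_ratio(labels, [v <= thr for v in values]))
    return best


labels = [0, 0, 1, 1]
print(information_gain(labels, [1, 2, 3, 4]), gain_ratio(labels, [1, 2, 3, 4]))
print(best_threshold_gain_ratio(labels, [1, 2, 3, 4]))
```

In the toy data, treating the four distinct values as a nominal feature yields a perfect gain of one bit but a gain ratio of only one half, while the best threshold split recovers a ratio of one; this is the inflation effect the ratio is meant to correct.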


The results of our feature analysis are presented in Figures 6.8-6.11. The individual
features’ abilities to discriminate between non-security and security patches, as measured
by the information gain ratio, are recorded in Figure 6.8. For the nominal features (author,
top-level directory, file type, and day of week) we compute the information gain ratios
directly, whereas for the continuous features (diff length, number of lines in diff, file size,
number of files, and time of day) we use the information gain ratio given by choosing the
best threshold value. The author feature has the most discriminative power, providing an
information gain ratio 1.8 times larger than the next most informative feature. The next
two most discriminative features are top-level directory and diff length, which enjoy similar
gain ratios. The remainder of the patch size features and the file type have smaller ratios,
and the two temporal features individually contribute significantly less information.

Figure 6.9: The authors who committed security patches, ordered by proportion of
patches that are security patches. Top four authors are identified with their security
and total patches along-side.

Figure 6.10: The top-level directories ordered by the proportion of patches that
are security patches. The top four directories are identified with their security and total
patches along-side.

Figure 6.11: The CDFs of the security and non-security diff lengths. The figure is
“zoomed” in on the left, with the top 1,000 largest lengths not shown.

Furthermore, we show that each individual feature alone does not provide significantly
high  discriminative  power.  This  observation  follows  from  the  maximum  information  gain 
ratio  (belonging  to  author)  being  tiny:  3  x  10“^.  To  add  credence  to  these  numbers,  we 
note  also  that  the  unnormalized  information  gains  have  a  similar  ordering  with  the  author 
feature  coming  out  on  top  with  an  information  gain  of  2  x  10“^,  which  corresponds  to  a 
small change in entropy. To summarize, we offer the following remark.

Remark  57.  Some  features  provide  discriminative  power  for  separating  security  patches 
from  non-security  patches,  with  author,  top-level  directory,  and  diff  length  among  the  most 
discriminative.  However,  individually,  no  feature  provides  significant  discriminative  power 
for  separating  security  patches  from  non-security  patches. 


Analysis of Discriminating Features. To give intuition why some features provide
discriminating power for security vs. non-security patches, we analyze in more detail the
three most discriminative features: the author, top-level directory, and diff length.

For the author and top-level directory features, we analyze their influential feature values.
For each occurring feature value, we compute its proportion value: the number of patches
that have that feature value and are security-sensitive, divided by the total number of patches
with that feature value. Figures 6.9 and 6.10 depict the influential feature values for the
authors and top-level directory features by ranking the feature values by their proportion
values. During the life-cycle of Firefox 3, a total of 516 authors contributed patches, out
of which 38 contributed at least one security patch. In Figure 6.9 we show only the 38
authors who wrote at least one security patch and omit the remaining 478 authors who did
not write any security patches. Notice that the top four authors (labeled) are all members
of the Mozilla Security Group (Veditz, 2010).
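The proportion value used to order the feature values in these figures can be sketched as below. The function name and the toy author labels are hypothetical; the computation is simply security patches over total patches for each feature value, sorted in decreasing order.

```python
from collections import Counter


def proportion_values(feature_values, is_security):
    """Return (value, proportion) pairs sorted by decreasing proportion of
    patches with that feature value that are security patches."""
    total = Counter(feature_values)
    security = Counter(v for v, s in zip(feature_values, is_security) if s)
    props = {v: security[v] / total[v] for v in total}
    return sorted(props.items(), key=lambda kv: kv[1], reverse=True)


# Toy example with three hypothetical authors.
authors = ["a", "a", "b", "b", "b", "c"]
labels = [True, False, False, False, True, False]
print(proportion_values(authors, labels))
```

Values with a proportion of zero, such as author "c" here, correspond to the authors omitted from the plot.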


To explore the third most individually discriminative feature, the continuous diff length,
we plot the feature CDFs for security and non-security patches in Figure 6.11. When all diff
lengths are displayed, the CDFs resemble step functions because the diff length distribution
is extremely heavy-tailed. Figure 6.11 zooms in on the left portion of the CDF by plotting
the curves for the first 6,500 (out of 7,500) unique diff lengths. From the relative positions
of the CDFs, we observe that security patches have shorter diff lengths than non-security
patches. This matches our expectations that patches that add features to Firefox require
larger diffs.

Although our feature analysis identifies features that are individually more discriminative
than others, the analysis also shows that no individual feature effectively predicts whether
a patch is security-related. This observation demonstrates that motivated attackers should
look to more sophisticated statistical techniques (such as an SVM) to reduce the effort
required to find security patches and suggests that obfuscating individual features will not
plug  information  leaks  in  the  open-source  life-cycle.  Moreover  certain  features  cannot  be 
effectively  obfuscated.  For  example,  the  Mozilla  Committer’s  Agreement  would  be  violated 
if  developer  names  were  redacted  from  mozilla-central. 


6.5.2  Classifier  Performance 


Figure 6.12 depicts the time series of scores assigned to each patch by the SVM (the
horizontal axis shows the day when each patch is landed in the repository, starting at
2008-09-24, at which point security patches were first announced). Note that, as mentioned in
Section 6.4.2, we use an online learning approach, so the score assigned to a patch is
computed using only the SVM trained with labeled patches seen up to the most recent security
update before the day the patch is landed in the repository (and including any out-of-release
delayed disclosures occurring before the patch is landed). Notice that the scores for security
and non-security patches in the first 50 days are quite similar. Over time, the SVM learns
to assign high scores to a handful of vulnerability fixes (and a few non-security patches) and
low scores to a handful of non-security patches. However, many patches are assigned very
similar scores of around 0.01 irrespective of whether they fix vulnerabilities.

Viewed as a binary classifier, the SVM performs quite poorly because there is no sharp
threshold that divides security patches from non-security patches. However, when viewed
in terms of the attacker’s utility, the SVM might still be useful in reducing effort because
the relative rankings of the vulnerability fixes are generally higher than those of most
non-security patches. We make this notion precise in the next subsection.

Figure 6.12: The time series of SVM probability estimates, with security patches and
non-security patches delineated by color.

Remark  58.  When  used  as  a  classifier,  the  SVM  classifier  performs  poorly.  However, 
attacker  effort  depends  only  on  the  relative  patch  rankings  assigned  by  the  detector  and  is 
not  necessarily  affected  by  poor  absolute  predictive  performance. 


6.5.3 Cost-Benefit of SVM-Assisted Vulnerability Discovery

6.5.3.1 Attacker Effort


Figure 6.13 shows the time series of the effort the attacker expends to find a vulnerability
(as measured by the number of patches the attacker examines), as described in
Section 6.3.2.1. The attacker effort measured for a given day is computed to reflect the
following estimate. Imagine, for example, an attacker who “wakes up” on a given day, trains
an SVM on publicly available information (including all labeled patches before the current
day), and then starts looking for security patches among all the (unlabeled) patches landed
in the repository since the most recent security update, in the rank order provided by the SVM.
Then the attacker effort measured for a given day is the number of patches that the attacker
has to examine before finding a security patch using the rank order provided by the SVM.
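The effort measure just described can be illustrated with a minimal sketch. It is not the code used in the experiments; the function name and the list-based inputs are assumptions made for illustration. It assumes that higher scores are examined first and that the security patch itself counts as an examined patch, so the smallest possible effort is 1.

```python
from typing import Optional, Sequence


def attacker_effort(scores: Sequence[float],
                    is_security: Sequence[bool]) -> Optional[int]:
    """Patches examined, in descending score order, up to and including
    the first security patch; None if the pool holds no security patch."""
    if len(scores) != len(is_security):
        raise ValueError("scores and labels must have the same length")
    # Rank patch indices from the highest to the lowest detector score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for examined, idx in enumerate(ranked, start=1):
        if is_security[idx]:
            return examined
    return None
```

Returning None for a pool without security patches mirrors the gaps between segments in the effort time series.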

Each  continuous  segment  in  the  graph  corresponds  to  one  of  the  12  security  updates 
during  our  study.  For  a  period  of  time  after  each  release,  there  are  no  security  patches 
in  mozilla-central,  which  is  represented  on  the  graph  as  a  gap  between  the  segments. 


[Figure 6.13 plot. Title: Attacker Effort Time Series; horizontal axis: time (days after
2008-09-24).]

Figure 6.13: The attacker effort (number of patches to check before finding a vulnerability)
of the SVM, and the expected attacker effort of the random ranker, as a function of time.


For the first 50 days of the experiment both the random ranker and the SVM-assisted
attacker expend relatively large amounts of effort to find security patches. This poor initial
performance of the SVM, also observed in Figure 6.12, is due to insufficient training. The
SVM, like any statistical estimator, requires enough data with which to generalize.


Remark 59. During the latter two thirds of the year (the 8 month period starting 50 days after
2008-09-24)  the  SVM-assisted  attacker,  now  with  enough  training  data,  regularly  expends 
significantly  less  effort  than  an  attacker  who  examines  patches  in  a  random  order. 


Note. Given the “warm-up” effects of the first 50 days when the SVM has insufficient
training data, all non-time-series figures in the sequel are shown using the data after
2008-11-13.

The general cyclic trends of the SVM-assisted and random rankers are also noteworthy.
In most inter-update periods, the random ranker enjoys a relatively low attacker effort
(though higher than the SVM’s) which quickly increases. The reason for this behavior
can be understood by plotting the expected effort for the random ranker with respect to
the number of security patches for various total patch pool sizes, as shown in Figure 6.4.
Immediately after the landing of a first post-update security patch, the pool of available
patches gets swamped by non-security patches (cf. Figure [G^), corresponding to increasing
n in Figure 6.4 and greatly increasing the expectation. Further landings of security patches
are few and far between (by virtue of the rarity of such patches), and so moving across the
figure with increasing n_s is rare. As the periods progress, non-security patches continue to
swamp security patches. This trend for the random ranker’s expected effort is more directly
seen in Figure 6.5, which plots expected effort over a prototypical cycle of Firefox 3. Over
the single 31 day cycle, 39 patches land daily of which a constant proportion are security
patches. The curve for 10^-2 most closely represents Firefox 3, where the security patch rate
is 0.0085 of the total patch rate. The trend observed empirically in Figure 6.13 matches
both the overall shape and location of the predicted trend.

[Figure 6.14 plot. Title: CDFs of Attacker Efforts - From 2008-11-13; horizontal axis:
attacker effort (log scale).]

Figure 6.14: The cumulative distribution functions of the attacker effort displayed
in Figure 6.13, from 11/13/2008 onwards. CDFs are shown for both the SVM and the
random ranker.

[Figure 6.15 plot. Title: Vulnerability Window vs. Attacker Effort; horizontal axis:
patches attacker is willing to examine daily (log scale).]

Figure 6.15: The total increase to the vulnerability window size throughout the year,
for a given level of daily attacker effort with or without SVM assistance. Results
trimmed to 11/13/2008 onwards.
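The swamping effect can be made concrete with a short sketch. It relies on a standard fact about sampling without replacement: when a pool of n patches containing n_s security patches is examined in uniformly random order, the expected position of the first security patch is (n + 1) / (n_s + 1). The random ranker's definition in Section 6.3.2 is not reproduced here, so treating this closed form as its expected effort is an assumption, and the function names are illustrative only.

```python
from fractions import Fraction
from itertools import combinations


def expected_random_effort(n: int, n_s: int) -> Fraction:
    """Expected position of the first security patch when n patches,
    n_s of them security patches, are examined in random order."""
    if not 0 < n_s <= n:
        raise ValueError("need 0 < n_s <= n")
    return Fraction(n + 1, n_s + 1)


def brute_force_expected_effort(n: int, n_s: int) -> Fraction:
    """Exact check for small pools: average the earliest position over
    every equally likely placement of the security patches."""
    firsts = [min(positions)
              for positions in combinations(range(1, n + 1), n_s)]
    return Fraction(sum(firsts), len(firsts))
```

With n_s held fixed, the expectation grows roughly linearly in n, which is the upward drift seen within each inter-update period; each additional security patch divides it by a larger denominator.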

At times early on in the inter-release periods, the SVM-assisted attacker experiences
the same upward trending effort, but eventually the developers land a security patch that
resembles the training data. Given just one “easy” fix, the effort required of the
SVM-assisted attacker plummets. In two cycles (and partially in two others) the SVM-assisted
attacker must expend more effort than the random ranker. This is due to a combination
of factors including the small rates of landing security patches, which means that the only
unreleased security patches may not resemble previous training data.


6.5.3.2 Proportion of Days of Successful Vulnerability Discovery

Figure 6.14 depicts the cumulative distribution function (CDF) of attacker effort
(Figure 6.13), showing how often the SVM-assisted and random rankers can find a security patch
as a function of effort. Note that the CDFs asymptote to 0.90 rather than 1.0 because
mozilla-central did not contain any security patches during 10% of the 8 month period.

Remark 60. The SVM-assisted attacker discovers a security patch with the first examined
patch for 34% of the 8 month period. Moreover, by examining only 2 patches, the
SVM-assisted attacker can find security patches over 39% of the period.
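The CDF values quoted here can be read as simple proportions of days. The sketch below is an illustrative assumption about how such a curve might be computed rather than the original analysis code: each day contributes its measured effort, or None when the repository holds no unannounced security patch, and such days can never count as a success, which is why the curve levels off below 1.0.

```python
from typing import Optional, Sequence


def effort_cdf(daily_efforts: Sequence[Optional[int]], effort: int) -> float:
    """Fraction of days on which examining at most `effort` patches
    is enough to find a security patch."""
    if not daily_efforts:
        raise ValueError("need at least one day of measurements")
    # Days with no security patch (None) are never counted as a success.
    hits = sum(1 for e in daily_efforts if e is not None and e <= effort)
    return hits / len(daily_efforts)
```

For a hypothetical ten-day series in which one day has no security patch, the value at any sufficiently large effort is 0.9.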

If the unassisted attacker expends the minimum effort of 18.5, it can only find security
patches for less than 0.5% of the 8 month period. By contrast, an SVM-assisted attacker
who examines 17 patches will find a security patch during 44% of the period. In order to
find security patches for 22% of the 8 month period, the random ranker must examine on
average up to 70.3 patches. The SVM-assisted attacker achieves significantly greater benefit
than an attacker who examines patches in random order, when small to moderate numbers
of patches are examined (i.e., up to 100 patches).

When examining 100 or more patches, the SVM-assisted and random rankers find security
patches for similar proportions of the 8 month period, with the random ranker achieving
slightly better performance.


6.5.3.3 Total Increase in the Window of Vulnerabilities

Although the CDF of attacker effort measures how hard the attacker must work in
order to find a patch that fixes a vulnerability, the CDF does not measure how valuable
that vulnerability is to an attacker. In Figure 6.15 we estimate the value of discovering a
vulnerability by measuring the total increase in the window of vulnerability gained by an
attacker who expends a given amount of effort each day (as described in Section 6.3.2.2).
Note that this differs from the previous section by considering an attacker who aggregates
work over multiple days, and who does not re-examine patches from day-to-day.
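One possible reading of this budgeted attacker is sketched below for a single inter-update cycle. The precise definition lives in Section 6.3.2.2, which is not reproduced here, so the details are assumptions: the gain is taken to be the number of days between the attacker's discovery and the public announcement, the announcement is placed on the day after the last listed day, and the function name and inputs are hypothetical.

```python
from typing import List, Sequence, Set


def window_gain_for_cycle(daily_rankings: Sequence[List[str]],
                          security_ids: Set[str],
                          budget: int) -> int:
    """Days of extra exposure gained in one inter-update cycle.

    daily_rankings[d] lists the patch ids available on day d of the
    cycle, best-ranked first. The announcement is assumed to fall on
    day len(daily_rankings).
    """
    examined: Set[str] = set()
    announcement_day = len(daily_rankings)
    for day, ranking in enumerate(daily_rankings):
        # The attacker never re-examines a patch seen on an earlier day.
        fresh = [p for p in ranking if p not in examined][:budget]
        examined.update(fresh)
        if any(p in security_ids for p in fresh):
            return announcement_day - day
    return 0  # no security patch found before the announcement
```

Summing this quantity over all cycles for a fixed budget would give one point on a curve of total window increase against daily effort.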


Remark  61.  At  1  or  2  patches  examined  daily  over  the  8  month  period,  the  SVM-assisted 
attacker  increases  the  window  of  vulnerability  by  89  or  148  days  total,  respectively.  By 
contrast  the  random  ranker  must  examine  3  or  7  patches  a  day  (roughly  3  times  the  work)  to 
achieve approximately the same benefit. At small budgets of 1 or 2 patches daily, the random
ranker achieves window increases of 47 or 82 days, which are just over half the SVM-assisted
attacker’s  benefits.  At  higher  daily  budgets  of  7  patches  or  more,  the  two  attackers  achieve 
very  similar  benefits  with  the  random  ranker’s  being  slightly  (insignificantly)  greater. 


Compared to the Firefox 3 baseline vulnerability window size of 3.4 days (see
Section 6.3.2.2), the increases to window size of 89 and 148 days represent multiplicative
increases by factors of 3.9 and 6.4 respectively.


6.5.3.4 In Search of Severe Vulnerabilities

Thus far, we have treated all vulnerabilities equally. In reality, attackers prefer to
exploit higher severity vulnerabilities because those vulnerabilities let the attacker gain more
control over the user’s system. To evaluate how well the attacker fares at finding severe
vulnerabilities (those labeled as either “high” or “critical” in impact; Adamski, 2009), we
measure the attacker effort required to find the first high or critical vulnerability (that is,
we ignore “low” and “moderate” vulnerabilities). Note that we did not re-train the SVM
on severe vulnerabilities even though re-training could lead to better results for the special
case of discovering high-severity vulnerabilities. Figures 6.16-6.18 present our results for
finding severe vulnerability fixes. The attacker effort time series for the SVM-assisted and
random rankers is displayed in Figure 6.16. Overall, the attacker effort curves are similar to
those for all vulnerabilities, just shifted upwards away from 1 during several inter-update
periods.

[Figure 6.16 plot. Title: Attacker Effort Time Series - Severe Vulnerability Discovery.]

Figure 6.16: The time series of SVM-assisted and random ranker effort for finding severe
(high or critical level) vulnerabilities.

[Figure 6.17 plot. Title: CDFs of Attacker Efforts (Severe Vulnerabilities); horizontal axis:
attacker effort (log scale).]

Figure 6.17: The CDFs of the SVM-assisted and random ranker efforts for discovering
severe vulnerabilities, from 11/13/2008 onwards.

[Figure 6.18 plot. Title: Severe Vulnerability Window vs. Attacker Effort; horizontal axis:
patches attacker is willing to examine daily (log scale).]

Figure 6.18: Total increase to the vulnerability window for finding severe vulnerabilities
given levels of daily attacker effort, from 11/13/2008 onwards.

We can interpret the effect of focusing on severe vulnerabilities by examining the attacker
effort CDFs in Figure 6.17. Although both attackers asymptote to the lower proportion of
the period containing severe vulnerability fixes (down from 90% for identifying arbitrary
vulnerabilities to 86%), only the random ranker’s CDF is otherwise relatively unchanged.
The random ranker’s minimum effort has increased from 18.5 to 20.3 patches with a
similarly low probability. The SVM-assisted attacker CDF undergoes a more drastic change.
Examining one patch results in a vulnerability for 14% of the 8 month period, whereas
efforts of 6 and 21 produce vulnerabilities for 20% and 34% of the 8 month period,
respectively. To achieve these three proportions the random ranker must examine 48, 52, and 76
patches respectively.

Remark  62.  The  SVM-assisted  attacker  is  still  able  to  outperform  the  random  ranker  in 
finding  severe  vulnerabilities,  in  particular  finding  such  security  fixes  20%  of  the  time  by 
examining  6  patches. 

The increases to the severe vulnerability window are shown for the two attackers in
Figure 6.18. Again, we see a shift, with the SVM-assisted attacker continuing to outperform
the random ranker on small budgets (except for a budget of 1 patch) or otherwise perform
similarly.

Remark  63.  By  examining  2  patches  daily  during  the  8  month  period,  the  SVM-assisted 
attacker  increases  the  vulnerability  window  by  131  days.  By  contrast  the  random  ranker  with 
budget  2  achieves  an  expected  window  increase  of  72  days. 


6.5.3.5 When One is Not Enough: Finding Multiple Vulnerabilities

An  attacker  searching  for  security  patches  might  suffer  from  false  negatives:  the  attacker 
might  mistakenly  take  a  security  patch  as  a  non-security  patch.  In  practice,  an  attacker 
may  wish  to  examine  more  patches  than  represented  by  the  attacker  effort  defined  above. 
To  model  this  situation,  we  considered  the  problem  of  finding  2  or  3  security  patches  instead 
of  just  one. 
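Extending the effort measure from the first security patch to the k-th is a small change, sketched here under the assumption that the patches have already been sorted into the detector's rank order. The function name and the boolean-list input are illustrative, not taken from the original experiments.

```python
from typing import Optional, Sequence


def effort_for_kth(ranked_is_security: Sequence[bool], k: int) -> Optional[int]:
    """Patches examined, in rank order, up to and including the k-th
    security patch; None if fewer than k security patches are present."""
    if k < 1:
        raise ValueError("k must be at least 1")
    found = 0
    for examined, flag in enumerate(ranked_is_security, start=1):
        if flag:
            found += 1
            if found == k:
                return examined
    return None
```

Because the k-th security patch can never be ranked above the (k-1)-th, the effort is non-decreasing in k whenever it is defined.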


As depicted in Figures 6.19-6.21, finding 1, 2, or 3 security patches requires progressively
more effort. When computing the increase to the window of vulnerabilities in Figure 6.21,
we assume that the attacker’s analysis of the examined patches only turns up the 1st, 2nd
and 3rd security fixes respectively. To find 2 or 3 security patches over 34% of the 8 month
period, the SVM-assisted attacker must examine 35 or 36 patches respectively.

Finally, consider approximating the window of vulnerability achieved by an attacker
examining a single patch daily with no false negatives. Examining 3 patches a day increases
the total vulnerability window by 83 days even if the attacker’s analysis produces one false
negative each day. Assuming two false negatives each day, examining 4 patches daily
increases the window by 80 days total. Similarly, increasing the window by 151 or 148 days,
approximating the error-free result under a two patch per day budget, requires examining
12 or 18 patches daily when suffering one or two false negatives respectively.

[Figure 6.19 plot. Title: Attacker Effort Time Series - 1st, 2nd, 3rd Vulnerabilities.]

Figure 6.19: The time series of SVM-assisted ranker effort for finding 1, 2 or 3 vulnerabilities.

[Figure 6.20 plot. Title: CDFs of Attacker Efforts - 1st, 2nd, 3rd Vulnerabilities; horizontal
axis: attacker effort (log scale).]

Figure 6.20: The CDFs of the SVM-assisted efforts for discovering 1, 2 or 3 vulnerabilities,
from 11/13/2008 onwards.

[Figure 6.21 plot. Title: Vulnerability Window vs. Effort - 1-3 Vulnerabilities; horizontal
axis: patches attacker is willing to examine daily (log scale).]

Figure 6.21: Total increase to the vulnerability window for finding 1, 2 or 3 vulnerabilities
given levels of daily attacker effort, from 11/13/2008 onwards.

6.5.4 Repeatability of Results Over Independent Periods of Time


In the above sections we explore how an attacker can find vulnerabilities over the lifetime
of a major release of a large open-source project. It is natural to ask: how repeatable are
these results over subsequent releases? As a first step towards answering this question, we
repeat our analysis on the complete life-cycle of Firefox 3.5.

To test whether our results for the SVM-assisted and random rankers carry over to other
releases of Firefox, we again focus on mozilla-central, cloning the entire repository to identify
patches  landed  during  the  Firefox  3.5  life-cycle.  Firefox  3.5  was  released  June  30,  2009  and 
remained  active  until  the  release  of  Firefox  3.6  on  January  21,  2010.  During  this  6  month 
period  7  minor  releases  to  Firefox  were  made  ending  with  Firefox  3.5.7  on  January  5,  2010. 
We  consider  patches  landed  between  the  release  of  Firefox  3.5.7  and  Firefox  3.6,  whose 
identities  as  security  patches  or  non-security  patches  were  disclosed  February  17,  2010  upon 
the  release  of  Firefox  3.5.8.  During  the  6  month  period,  7,033  patches  were  landed  of  which 
54  fixed  vulnerabilities. 

While  the  Firefox  3.5  patch  volumes  correspond  to  roughly  half  those  of  the  year-long 
period  of  active  development  on  Firefox  3,  it  is  possible  that  the  patches’  metadata  may 
have  changed  subtly,  resulting  in  significant  differences  in  SVM-assisted  ranker  performance. 
Changes  to  contributing  authors,  functions  of  top-level  directories,  diff  sizes  or  other  side- 
effects  of  changes  to  coding  policies,  time  of  day  or  day  of  week  when  patches  tend  to  be 
landed,  could  each  contribute  to  changes  to  the  attacker’s  performance.  Given  the  similar 
rates  of  patch  landings,  one  can  expect  the  random  ranker’s  performance  to  be  generally 
comparable  to  the  Firefox  3  results. 

Figures 6.22-6.24 depict the cost-benefit analysis of the SVM-assisted and random
rankers searching for vulnerabilities in Firefox 3.5. It is immediately clear that the same
kind of performance is enjoyed by the attackers as achieved for Firefox 3, if not slightly
better. The CDFs of attacker effort displayed in Figure 6.23 show that while the random
ranker’s performance is roughly the same as before, the SVM-assisted ranker’s performance
at very low effort (1 or 2 patches) is inferior compared to Firefox 3, while the assisted
attacker enjoys much better performance at low to moderate efforts.


Remark 64. The SVM-assisted attacker discovers a security patch in Firefox 3.5 by the
third patch examined, for 22% of the 5.5 month period; by the 20th patch the SVM-assisted
attacker finds a security patch for 50% of the period. By contrast the random ranker must
examine 69.1 or 95 patches in expectation to find a security patch for these proportions of
the 5.5 month period.


In a similar vein, the increase to the window of vulnerability achieved by the random
ranker is comparable between Firefox 3 and 3.5 (correcting for the differences in release
lifetimes), while the SVM-assisted attacker achieves superior performance (cf. Figure 6.24).

Remark  65.  By  examining  one  or  two  patches  daily,  the  SVM-assisted  ranker  increases 
the  window  of  vulnerability  (in  aggregate)  by  91  or  113  days  total  (representing  increases  to 
the base vulnerability window of factors of 5.8 and 6.7 respectively). By contrast the random
ranker achieves increases of 25.1 or 43.8 days total under the same budgets.

Figure 6.22: SVM-assisted and random ranker efforts for finding Firefox 3.5 vulnerabilities.

Figure 6.23: The CDFs of the SVM-assisted and random ranker attacker efforts, for
Firefox 3.5.

Figure 6.24: Increase to the total window of vulnerability achieved for varying levels of
daily attacker effort, for Firefox 3.5.


We may conclude from these results that the presented attacks on Firefox 3 are repeatable
for Firefox 3.5, and we expect our analysis to extend to other major releases of Firefox and
major open-source projects other than Firefox.

[Figure 6.25 plot. Title: CDFs of Attacker Efforts - Feature Removal; horizontal axis:
attacker effort (log scale).]

Figure 6.25: The effect of removing individual features on the SVM-assisted attacker
effort CDFs, from 11/13/2008 onwards.

[Figure 6.26 plot. Title: Vulnerability Window vs. Effort - Feature Removal.]

Figure 6.26: The effect of removing features on the SVM-assisted increase to the
vulnerability window, from 11/13/2008 onwards.


6.5.5  Feature  Analysis  Redux:  the  Effect  of  Obfuscation 

In Section 6.5.1 we perform a filter-based feature analysis for discriminating between
security patches and non-security patches. In this section we ask: what is the effect of
obfuscating individual features? We answer this question through a wrapper-based feature
analysis in which we perform the same simulation of an SVM-assisted ranker as above, but
now with one feature removed.
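The wrapper-based procedure amounts to a leave-one-group-out loop around the full simulation. The sketch below shows only that outer loop, under stated assumptions: `evaluate` is a hypothetical callback standing in for retraining the ranker on a given feature set and returning a single summary number (for example, a CDF value averaged over a range of efforts), and the group names used to call it are illustrative.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Mapping


def ablation_deltas(groups: Mapping[str, Iterable[str]],
                    evaluate: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Change in the evaluation metric when each feature group is removed.

    A negative delta means that removing the group hurts the attacker.
    """
    frozen = {name: frozenset(features) for name, features in groups.items()}
    all_features = frozenset().union(*frozen.values())
    baseline = evaluate(all_features)
    deltas: Dict[str, float] = {}
    for name, features in frozen.items():
        # Re-run the whole simulation without this one group.
        deltas[name] = evaluate(all_features - features) - baseline
    return deltas
```

Grouping correlated features, as with the diff size features, keeps the comparison meaningful when removing any single member of the group changes nothing.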

Figure 6.25 depicts the attacker effort CDFs for the SVM-assisted ranker when trained
with all features, and when trained with either the author, top directory, file type, time of
day, day of week, or the set of diff size features removed. We remove the number of characters
in the diff, number of lines in the diff, number of files in the diff, and file size simultaneously,
since we observed no difference when only one of these features was removed. A plausible
explanation for this invariance would be high correlation among these features. Removing
the author feature has the most negative impact on the attacker effort CDF, reducing the
proportion by 0.048 on average over attacker efforts in [1, 315]. That is, averaged over
attacker efforts, for 5% of the 8 month period a security patch is found by the SVM-assisted
ranker trained with the author feature while the attacker without access to patch author
information finds no security patch. While the effect is most significant over attacker
efforts in [1, 10], the performance of the SVM-assisted ranker without the author feature
is still strong in this range. Removing the file type, time of day, diff size, day of week, or
top directory has an increasingly positive impact on the overall attacker effort CDF. Despite
libsvm’s use of cross-validation for tuning the SVM’s parameters, the positive improvements
point to overfitting, which could be a product of the high dimensionality of the learning
problem together with a very small sample of security patches. As noted above, our goal is
merely to lower bound the performance of an attacker assisted by machine learning.

The increase to the window of vulnerability achieved by an SVM-assisted attacker without
access to individual features (or the group of diff size features) is shown in Figure 6.26.
For some attacker efforts the increase is less without certain features, but overall we see a
more positive effect. The least positive effect is observed when removing the author feature:
the increase in window size is only 5 days more on average over attacker efforts in [1, 57]
than when the author feature is included.

We thus draw the following conclusion, which agrees with the filter-based feature analysis
presented in Section 6.5.1.


Remark  66.  Obfuscating  the  patch  author  has  the  greatest  negative  impact  on  the  SVM- 
assisted  ranker’s  performance,  relative  to  obfuscating  other  features  individually.  However 
the  magnitude  of  impact  is  negligible. 


As  noted  above,  even  if  the  impact  of  obfuscating  patch  authors  were  greater,  doing  so 
would  violate  the  Mozilla  Committer’s  Agreement. 


6.6  Improving  the  Security  Life-Cycle 

In  this  section,  we  explore  ways  in  which  open-source  projects  can  avoid  information 
leaks  in  their  security  life-cycle.  Instead  of  attempting  to  obfuscate  the  features  an  attacker 
could use to find security patches, we recommend that the developers land vulnerability fixes
in  a  “private”  repository  and  use  a  set  of  trusted  testers  to  ensure  the  quality  of  releases. 

6.6.1  Workflow 

A  natural  reaction  to  our  experiment  is  to  attempt  to  plug  the  information  leaks  by 
obfuscating  patches.  However,  we  argue  that  this  approach  does  not  scale  well  enough 
to  prevent  a  sophisticated  attacker  from  detecting  security  patches  before  announcement 
because  an  attacker  can  use  standard  machine  learning  techniques  to  aggregate  information 
from a number of weak indicators. In general, it is difficult to predict how such a
“cat-and-mouse” game would play out, but, in this case, the attacker appears to have a
significant advantage over the defender.

Instead  of  trying  to  plug  each  information  leak  individually,  we  recommend  re-organizing 
the  vulnerability  life-cycle  to  prevent  information  about  vulnerabilities  from  flowing  to  the 
public  (regardless  of  how  well  the  information  is  obfuscated).  Instead  of  landing  security 
patches  in  the  public  mozilla-central  repository  first,  we  propose  landing  them  in  a  private 
release  branch.  This  release  branch  can  then  be  made  public  (and  the  security  patches 
merged  into  the  public  repository)  on  the  day  the  patch  is  deployed  to  users.  This  workflow 
reverses  the  usual  integration  path  by  merging  security  fixes  from  the  release  branch  to 
mozilla-central instead of from mozilla-central to the release branch.

6.6.2 Quality Assurance

The  main  cost  of  landing  security  patches  later  is  that  the  patches  receive  less  testing 
before  release.  When  the  Firefox  developers  land  security  patches  in  mozilla-central, 
those  patches  are  tested  by  a  large  number  of  users  who  run  nightly  builds  of  Firefox.  If  a 
security  patch  causes  a  regression  (for  example,  a  crash),  these  users  can  report  the  issue 
to  the  Firefox  developers  before  the  patch  is  deployed  to  all  users.  The  Firefox  developers 
can  then  iterate  on  the  patch  and  improve  the  quality  of  security  updates  (thereby  making 
it  less  costly  for  users  to  apply  security  updates  as  soon  as  they  are  available). 

Instead  of  having  the  public  at  large  test  security  updates  prior  to  release,  we  recommend 
that  testing  be  limited  to  a  set  of  trusted  testers.  Ideally,  this  set  of  trusted  testers  would 
be  vetted  by  members  of  the  security  team  and  potentially  sign  a  non-disclosure  agreement 
regarding  the  contents  of  security  updates.  The  size  of  the  trusted  tester  pool  is  a  trade-off 
between test coverage and the ease with which an attacker can infiltrate the program, which
is  a  risk  management  decision. 


6.6.3  Residual  Risks 


There are two residual risks with this approach. First, the bug report itself still leaks
some amount of information because the bug is assigned a sequential bug number that the
attacker can probe to determine when a security bug was filed. This information leak seems
fairly innocuous. Second, the process leaks information about security fixes on the day the
patch becomes available. This leak is problematic because not all users are updated
instantaneously (Duebendorfer and Frei, 2009). However, disclosing the source code contained
in each release is required by many open-source licenses. As a practical matter, source
patches are easier to analyze than binary-only patches, but attackers can reverse engineer
vulnerabilities from binaries alone (Brumley et al., 2008). One way to mitigate this risk is
to update all users as quickly as possible (Duebendorfer and Frei, 2009).

6.7  Summary 

Landing security patches in public source code repositories significantly increases the
window of vulnerability of open source projects. Even though security patches are landed
amid a cacophony of non-security patches, we show that an attacker can use off-the-shelf
machine learning techniques to rank patches based on intrinsic metadata. Our results show
that a handful of features are sufficient (in aggregate) to reduce the number of non-security
patches an attacker need examine before encountering a security patch. For 22% of the
period we study, the highest ranked patch actually fixes a security vulnerability. Because
our algorithm establishes only a lower bound on attacker efficacy, it is likely that real
attackers will be able to perform even better by considering more features or using more
sophisticated detection algorithms.

A natural reaction to these findings is to obfuscate more features in an attempt to
make the security patches harder to detect. However, our analysis shows that no single
feature contains much information about whether a patch fixes a vulnerability. Instead, the
detection algorithm aggregates information from a number of weak signals to rank patches,
suggesting that obfuscation is a losing battle. Instead of obfuscating patch metadata, we
recommend changing the security life-cycle of open-source projects to avoid landing security
fixes in public repositories. We suggest landing these fixes in private repositories and having
a pool of trusted testers test security updates (rather than the public at large).

Our recommendations reduce the openness of open-source projects by withholding some
patches from the community until the project is ready to release those patches to end
users. However, open-source projects already recognize the need to withhold some
security-sensitive information from the community (as evidenced by these projects limiting
access to security bugs to a vetted security group). In a broad view, limiting access to the
security patches themselves prior to release is a small price to pay to significantly reduce
the window of vulnerability.

Chapter 7

Conclusions  and  Open  Problems 


As  for  the  future,  your  task  is  not  to  foresee  it,  but  to  enable  it. 

- Antoine de Saint-Exupéry


Machine Learning, Statistics and Security stand to gain much from their cross-disciplinary
research. On the one hand, many real-world security-sensitive systems are now using
Machine Learning, opening up the possibility of new vulnerabilities due to attacks on Machine
Learning algorithms themselves. Understanding the effects of various kinds of attacks on
learners and designing algorithms for learning in adversarial environments are important
first steps before users will trust ‘black-box’ adaptive systems. On the other hand, many
defenses, and even attacks on non-adaptive systems, can greatly benefit by leveraging the
ability of Statistical Machine Learning based approaches to model both malicious and benign
patterns in data.

This dissertation’s main contributions lie in this intersection of Machine Learning,
Statistics and Security. Viewing Machine Learning under a lens of Security and Privacy, the
first part explores three kinds of attacks on adaptive systems in which an adversary can
manipulate a learner by poisoning its training data, submit queries to a previously-trained
learner in order to evade detection, or try to infer information about a learner’s
privacy-sensitive training data by observing models trained on that data. In the first and
last cases, defenses are proposed that are either evaluated experimentally or analyzed
theoretically to provide strong guarantees. In the second part, Machine Learning is leveraged
for building general defenses, and for constructing a specific attack on a non-adaptive
software system. Again, either strong theoretical guarantees or extensive experimental
evaluation demonstrate the significant gains made by using learning over non-adaptive
approaches.


Chapter Organization. The remainder of this chapter summarizes the main contributions
of this dissertation in greater detail in Section 7.1 and lists several open problems in
the intersection of Machine Learning and Security in Section 7.2.

7.1 Summary of Contributions

The  contributions  of  this  dissertation  span  the  spectrum  of  practical  attacks  on  real 
systems,  theoretical  bounds,  new  algorithms,  and  thorough  experimental  evaluation.  We 
detail  these  contributions  to  the  state-of-the-art  in  Machine  Learning  and  Computer  Security 
research  below. 


7.1.1  Attacks  on  Learners 


The taxonomy of attacks on Machine Learning systems of Barreno et al. (2006), amended
with the attacker goal of breaching training data privacy in Section 1.2.2, considers
adversaries aiming to effect one of three security violations: Integrity (False Negative
events), Availability (False Positive events), and Confidentiality (unauthorized access to
information), by either manipulating the training data (a Causative attack) or the learner’s
test data (an Exploratory attack). The three chapters of the first part consider Causative
attacks, Exploratory attacks, and Confidentiality attacks respectively. In the two former
cases, both Integrity and Availability goals are considered. In the latter case,
Confidentiality attacks may result from either Causative or Exploratory access to a learner.


Attacks that Poison the Training Data. Chapter reports on two large experimental
case-studies on Causative attacks, where the adversary manipulates the learner by poisoning
its training data. In the first case-study on SpamBayes (Meyer and Whateley, 2004;
Robinson, 2003), we construct poisoning attacks with the goal of increasing the False
Positive rate of the open-source email spam filter, as a Denial of Service (DoS) attack on
the learner itself. By contrast, previous work on attacking statistical spam filters has
focussed on Exploratory attacks in which good words are inserted into spam messages (Lowd
and Meek, 2005a; Wittel and Wu, 2004), or spammy words are replaced with synonyms
(Karlberger et al., 2007).
Our general approach is to send messages containing representative non-spam words to the
victim. By flagging such messages as spam, the victim unwittingly trains SpamBayes to
block  legitimate  mail.  We  study  the  effect  of  adversarial  information  and  control  on  the 
effectiveness  of  our  attacks.  We  experiment  with  attacks  using  knowledge  of  the  victim’s 
native  language  (English)  by  including  an  entire  dictionary  in  the  message,  knowledge  of 
the user’s colloquialisms modeled by incorporating words from a Usenet newsgroup, or
intimate knowledge of the tokens used to train the filter. Depending on the adversary’s
knowledge, the attack’s spams can be better tailored to the filter or can be shorter in size.
We  also  experiment  with  varying  amounts  of  adversarial  control  over  the  training  corpus 
through  tuning  the  proportion  of  training  messages  that  are  attack  spams.  We  determine, 
for  example,  that  with  only  1%  control  over  the  training  corpus  and  intermediate  knowledge 
of the non-attack training corpus, the filter’s FPR can be increased to 40%. At this point
most victims would shut off their filter, resulting in all subsequent spam messages being
sent straight to the user’s inbox. We also experiment with Targeted attacks in which the
adversary’s goal is to block specific legitimate messages. Here again we experiment with
adversarial control and information through the level of knowledge the adversary possesses
about the specific message’s tokens. With knowledge of only 30% of the target message, the
attack results in 60% of the target messages being incorrectly filtered. Finally we consider
two defenses based on measuring the impact of new messages on the classifier’s predictions
before inclusion in training, and dynamic thresholds on the spam scores. Experimental
results show these to be effective counter-measures against our Indiscriminate dictionary
attacks.
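
The mechanism behind the dictionary attack can be illustrated with a toy token-count filter. The sketch below is not SpamBayes: the scoring rule, the add-one smoothing, the tiny corpora and the proportion of attack messages are all hypothetical, chosen only to show how attack emails that the victim labels as spam raise the spam score of legitimate tokens.

```python
from collections import Counter

def train(messages):
    """Count, per token, how many spam and ham messages contain it."""
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for tokens, is_spam in messages:
        if is_spam:
            n_spam += 1
            spam_counts.update(set(tokens))
        else:
            n_ham += 1
            ham_counts.update(set(tokens))
    return spam_counts, ham_counts, n_spam, n_ham

def spam_score(tokens, model):
    """Average per-token spam probability with add-one smoothing (toy rule)."""
    spam_counts, ham_counts, n_spam, n_ham = model
    scores = []
    for t in set(tokens):
        p_spam = (spam_counts[t] + 1) / (n_spam + 2)
        p_ham = (ham_counts[t] + 1) / (n_ham + 2)
        scores.append(p_spam / (p_spam + p_ham))
    return sum(scores) / len(scores)

# Hypothetical training corpus: 20 legitimate and 20 spam messages.
ham = [(["meeting", "budget", "report"], False)] * 20
spam = [(["cheap", "pills", "offer"], True)] * 20

# Attack emails contain a "dictionary" of legitimate words and get labeled spam.
dictionary = ["meeting", "budget", "report", "lunch", "project"]
attack = [(dictionary, True)] * 5

target = ["meeting", "budget", "report"]  # a future legitimate email
clean_score = spam_score(target, train(ham + spam))
poisoned_score = spam_score(target, train(ham + spam + attack))
print(f"clean: {clean_score:.3f}  poisoned: {poisoned_score:.3f}")
```

In this toy setting the legitimate email's score rises once the attack messages enter the training set; a filter with a fixed threshold between the two scores would begin to misclassify it.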

In the second case-study of Chapter we consider Integrity attacks that poison the training
data of Principal Component Analysis (PCA) based network-wide volume anomaly detection,
with the aim of increasing the False Negative Rate for evasion during test time. PCA
became a popular tool in the Systems Measurement community for detecting DoS flows in
networks based on (relatively cheap to monitor) link traffic volume measurements (Guavus,
2010; Lakhina et al., 2004a,b, 2005a,b; Narus, 2010). While it has recently been observed
that PCA can in certain situations be sensitive to benign network faults (Ringberg et al.,
2007), our study is the first to quantify malicious tampering of PCA. To combat PCA, which
models  the  principal  components  in  link  space  that  capture  the  maximum  variance  in  the 
training  set,  our  variance  injection  attacks  inject  chaff  into  the  network  in  such  a  way  as 
to  increase  variance  in  a  desired  direction  while  minimally  impacting  traffic  volume;  the 
goal  being  to  enable  future  evasion  of  the  detector  by  manipulating  its  model  of  normal 
traffic  patterns,  while  poisoning  covertly  so  that  the  manipulation  itself  is  not  caught.  We 
design  attacks  that  exploit  increasing  adversarial  information  about  the  underlying  network 
traffic:  uninformed  poisoning  in  which  the  attacker  cannot  monitor  traffic  and  must  add 
Bernoulli  noise  to  the  network,  locally-informed  poisoning  in  which  the  attacker  can  moni¬ 
tor  the  traffic  passing  along  a  single  ingress  link,  and  globally-informed  poisoning  where  a 
worst-case  attacker  can  monitor  all  links  of  the  network.  In  each  poisoning  scheme,  we  in¬ 
clude  a  parameter  for  tuning  the  adversary’s  control  over  the  data  in  the  form  of  the  amount 
of  traffic  injected  into  the  network.  Experiments  on  a  single  week  of  training  and  a  single 
week  of  test  data  show  that,  for  example,  the  locally-informed  scheme  can  increase  the  FNR 
seven-fold  while  increasing  the  mean  traffic  volume  on  the  links  of  the  target  flow  by  only 
10%.  As  in  the  case  of  our  SpamBayes  attacks,  increased  information  or  increased  control 
hands  the  adversary  a  distinct  advantage.  We  explore  covert  attacks  on  PCA  in  which  the 
attacker  slowly  increases  his  amount  of  poison  chaff  over  the  course  of  several  weeks.  Even 
when  allowing  PCA  to  reject  data  from  its  training  set,  a  5%  compound  growth  rate  of 
traffic volume per week resulted in a 13-fold increase of the FNR to 50% over just a 3-week
period. To counter our variance-injection attacks, we propose a detector based on Robust
Statistics: ANTIDOTE selects its subspace using the PCA-grid algorithm, which maximizes
the robust MAD estimator of scale instead of variance, and uses a new Laplace threshold.
Experiments show that ANTIDOTE can halve the FNR due to the most realistic locally-informed
poisoning scheme while maintaining the performance of PCA on un-poisoned data.
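
The reason a MAD-based estimator of scale resists variance injection can be seen in one dimension. The following is a minimal sketch, not the PCA-grid algorithm: it compares how the standard deviation and the (unscaled) median absolute deviation respond to a single injected high-volume measurement. The data values are hypothetical.

```python
import statistics

def mad(values):
    """Median absolute deviation: median of |x - median(x)| (unscaled)."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])

# Hypothetical link-volume measurements projected onto one direction.
clean = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8]
poisoned = clean + [40.0]  # one injected burst of chaff

std_ratio = statistics.pstdev(poisoned) / statistics.pstdev(clean)
mad_ratio = mad(poisoned) / mad(clean)
print(f"standard deviation grows {std_ratio:.1f}x, MAD grows {mad_ratio:.1f}x")
```

A detector that picks directions by maximizing variance is pulled strongly toward the poisoned direction, while one that maximizes MAD is moved far less by the same point.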


Attacks that Query a Classifier to Evade Detection. In this dissertation’s second
set of contributions on attacks on learners, Chapter considers algorithms for querying
previously-trained classifiers in order to find minimal-cost instances labeled negative by
the classifier. In this theoretical study, we build on the abstract model of the evasion
problem due to Lowd and Meek (2005b): given query access to a classifier, a target positive
instance $x^A$ and a cost function $A$ measuring distance from $x^A$, our goal is to find a
negative instance that almost-minimizes $A$ while submitting only a small number of queries
to the classifier. Lowd and Meek (2005b) showed that by querying to learn the decision
boundary (what we call reverse engineering) an attacker can evade linear classifiers with
near-minimal $L_1$ cost with query complexity $O(D \log(1/\epsilon))$ for feature space
dimension $D$ and a cost of factor $1 + \epsilon$ from optimal. Our work extends this result
in several ways. We consider the much larger class of classifiers which partition feature
space into two classes, one of which is convex; and we consider more general $L_p$ cost
functions. For $p \le 1$ our multiline search algorithm can evade detection by classifiers
with convex positive class using very low query complexity
$O(\log(1/\epsilon) + D\sqrt{\log(1/\epsilon)})$, which improves on the result of Lowd and
Meek (2005b) without reverse engineering the decision boundary. Moreover a new lower bound
of $O(\log(1/\epsilon) + D)$ shows that our query complexity is close to optimal. For the
case of $p > 1$, evasion for convex positive classes takes exponential query complexity to
achieve good approximations. This result is a threshold-type phenomenon where the $p$
threshold for query complexity is at 1. For approximations that worsen with $D$ our
multiline search yields polynomial complexity solutions. For the case of $p > 1$ and
classifiers with convex negative classes, we apply the geometric random walk-based method of
Bertsimas and Vempala (2004) that tests for intersections between convex sets using a query
oracle. In this case our
randomized  method  has  polynomial  query  complexity  O*  log  .  An  important  corollary 
of  polynomial  complexity  for  evading  convex-inducing  classifiers  is  that  reverse  engineering 
is sufficient but not necessary for evasion, and can be much harder (i.e., reverse engineering
convex-inducing  classifiers  requires  exponential  complexity  for  some  classifiers). 
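
The query-based search underlying these results can be illustrated by a single line search. The sketch below is only a building block, not the multiline search algorithm itself: it binary-searches along one ray from the target positive instance using membership queries alone, and it uses an additive tolerance rather than the multiplicative $1 + \epsilon$ approximation. The classifier, direction and tolerance are hypothetical.

```python
def line_search(is_positive, x_target, direction, t_hi, tol):
    """Binary search on points x_target + t * direction for t in [0, t_hi].

    Assumes the point at t = 0 is positive, the point at t_hi is negative,
    and the positive class is convex, so the label flips once along the ray.
    Returns (t, queries), where t is the step of a negative point lying
    within tol of the decision boundary along this direction.
    """
    t_lo, queries = 0.0, 0
    while t_hi - t_lo > tol:
        t_mid = (t_lo + t_hi) / 2
        point = [x + t_mid * d for x, d in zip(x_target, direction)]
        queries += 1
        if is_positive(point):
            t_lo = t_mid  # still detected: move outward
        else:
            t_hi = t_mid  # evades detection: try a cheaper point
    return t_hi, queries

# Hypothetical classifier whose positive class is an L1 ball of radius 3.
def is_positive(point):
    return sum(abs(v) for v in point) <= 3.0

t_found, n_queries = line_search(
    is_positive, [0.0, 0.0], [1.0, 0.0], t_hi=10.0, tol=0.01
)
print(t_found, n_queries)
```

The number of queries grows only logarithmically as the tolerance shrinks, which is the source of the $\log(1/\epsilon)$ terms in the bounds above. The attacker never needs to estimate the boundary itself.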


Attacks that Violate Training Data Privacy. As this dissertation’s third and final
study on attacks on learners, Chapter explores privacy-preserving learning in the setting
where a statistician wishes to release a Support Vector Machine (SVM) classifier trained
on a database of examples, without disclosing significant information about any individual
example. We adopt the strong definition of $\beta$-differential privacy (Dwork, 2006) which
allows an attacker knowledge and/or control over $n - 1$ of the $n$ rows in the database,
knowledge of the release mechanism’s mapping, and access to the released classifier. Even
in the presence of such an (arguably unrealistically) powerful adversary, differential
privacy guarantees the privacy of a single hidden training example, where lower $\beta$
guarantees more privacy. The chapter includes two positive results in the form of mechanisms
for SVM with finite feature spaces (e.g., linear SVM and some nonlinear SVMs including the
polynomial kernel with any degree) and nonlinear SVM with translation-invariant kernels
inducing infinite dimensional feature spaces (e.g., the RBF or Gaussian kernel). For both
mechanisms
we  guarantee  differential  privacy  given  sufficient  noise  is  added  to  the  SVM’s  weight  vector 
by  the  mechanism.  For  a  new  notion  of  utility,  which  states  that  a  privacy-preserving 
mechanism’s  classifier  makes  predictions  that  closely  match  those  of  the  original  SVM  (a 
property  strictly  stronger  than  good  accuracy),  we  prove  that  both  mechanisms  are  useful 
provided that not too much noise is added. Thus we quantify what is an intuitive trade-off
between privacy and utility for the case of SVM learning. Finally we provide negative
results in the form of lower bounds that state that no mechanism that approximates the
SVM well (i.e., has very good utility) can have $\beta$-differential privacy for low values
of $\beta$: i.e., it is not possible to have very high levels of utility and privacy
simultaneously.
This  work  makes  several  contributions  within  the  area  of  privacy-preserving  learning  and 
statistical  databases.  First  our  mechanisms  respond  with  parametrizations  of  functions  that 
can belong to classes of infinite VC-dimension; by contrast existing mechanisms parametrize
scalars or vectors, or in a few cases relatively simple functions. Second the SVM does not
reduce to simple subset-sum computations, and so does not fit within the most common
technique  for  calculating  sensitivity  and  proving  differential  privacy.  Third  the  SVM  is  a 
particularly  practical  learning  method,  while  in  many  previous  studies  the  emphasis  was  on 
deep  analysis  of  more  simplistic  learning  methods.  Finally  our  proofs  draw  new  connections 
between  privacy,  algorithmic  stability  and  large-scale  learning. 
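
The idea of releasing a noisy weight vector can be sketched with the generic Laplace mechanism, which adds i.i.d. Laplace noise whose scale is the $L_1$ sensitivity divided by the privacy parameter $\beta$. This is an illustration only: the sensitivity value, the weights and the function names are hypothetical, and the calibration derived in the chapter for the SVM is not reproduced here.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_weights(weights, l1_sensitivity, beta, rng):
    """Release weights plus i.i.d. Laplace noise of scale l1_sensitivity / beta."""
    scale = l1_sensitivity / beta
    return [w + laplace_noise(scale, rng) for w in weights]

rng = random.Random(0)
weights = [0.8, -1.2, 0.3]  # hypothetical trained weight vector

# Lower beta means more privacy and therefore more noise (less utility).
strong = private_weights(weights, l1_sensitivity=0.1, beta=0.1, rng=rng)
weak = private_weights(weights, l1_sensitivity=0.1, beta=10.0, rng=rng)
print("beta = 0.1:", strong)
print("beta = 10: ", weak)
```

The noise scale makes the trade-off explicit: reducing $\beta$ by a factor of 100 increases the expected perturbation of every weight by the same factor.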

7.1.2  Learning  for  Attack  and  Defense 

Just as Security and Privacy offers potential benefits for improving applicability and
understanding of Machine Learning methods in practice, Machine Learning can be leveraged
to build effective defenses or to construct attacks on large software systems. The second
part of this dissertation explores two case-studies in which Machine Learning can be used to
build powerful defenses and attacks, greatly improving on the performance of related
non-adaptive approaches.

Learning-Based Risk Management. Chapter serves as a case-study in applying Machine
Learning as a defense, where known guarantees from the learning literature have
important consequences in Security. In the CISO problem, a Chief Information Security Officer
(CISO) must allocate her security budget over her organization with the goal of minimizing
the attacker’s additive profit or multiplicative return on attack (ROA). We model this
incentives-based risk management problem as a repeated game on a graph where nodes
represent states that can be reached by the attacker (e.g., root access to a server), and
edges represent actions the attacker can take to reach new nodes (e.g., exploit a buffer
overflow). Upon each attack, the adversary chooses a subgraph to attack. From reaching
certain nodes in the graph, the attacker can receive a payoff. On the other hand, budget
allocated by the CISO to edges results in a cost incurred by the attacker. Moreover some
edges are more difficult to defend than others, in the sense that more budget must be
allocated by the CISO in order to force the same cost on the adversary. By viewing the risk
management problem as a repeated game, we model high-level organizations, systems, or
country-level cyber-warfare battlefields in which attackers repeatedly attack the system
causing damage to the organization but not critical organization-ending damage. We construct
a reactive risk manager using the exponential weights algorithm from Online Learning Theory
which learns to allocate budget according to past attacks. Using a reduction to known regret
bounds from learning theory, we show that the average profit or ROA enjoyed by an adversary
attacking the reactive defender approaches that of an adversary attacking any fixed strategy.
In particular this includes the rational proactive risk manager that is aware of the
vulnerabilities (edges) in her organization, the valuation of the attacker, and uses this
information to play a minimax strategy over the course of the game. By contrast, through a
simple modification to the exponential weights algorithm, the reactive defender need not know
any vulnerabilities (edges) before-hand. This result is at odds with the conventional
security wisdom that reactive defense is akin to myopic bug chasing and is almost always
inferior to proactive approaches. Moreover we show that in several realistic situations, the
reactive defender vastly outperforms the fixed proactive defender. Example situations include
when the proactive defender minimizes attacker profit instead of ROA (or vice versa), when
the attacker is not rational, and when the proactive defender makes incorrect assumptions
about the attacker’s valuations.
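
A reactive allocation of this kind can be sketched with a generic exponential weights update. The code below is an assumption-laden illustration rather than the chapter's algorithm: the update rule, the learning rate `eta`, the edge indexing and the attack history are all hypothetical, and attacker payoffs and per-edge defense costs are ignored.

```python
import math

def exponential_weights(attack_history, n_edges, budget, eta):
    """Allocate budget over edges in proportion to exponential weights.

    attack_history is a list of rounds, each a set of attacked edge indices.
    Edges attacked more often in the past receive a larger budget share.
    """
    weights = [1.0] * n_edges
    for attacked in attack_history:
        for e in attacked:
            weights[e] *= math.exp(eta)  # multiplicative update per observed attack
    total = sum(weights)
    return [budget * w / total for w in weights]

# Hypothetical history over three edges: edge 0 attacked three times,
# edge 2 twice, edge 1 never.
history = [{0}, {0, 2}, {0}, {2}]
allocation = exponential_weights(history, n_edges=3, budget=100.0, eta=0.5)
print([round(a, 1) for a in allocation])
```

The defender here starts from a uniform allocation and needs no prior knowledge of which edges matter; the allocation shifts only in response to observed attacks.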

Learning-Based  Attacks  on  Open-Source  Software.  As  detailed  in  Chapter  in 
order  to  effectively  attack  open-source  software  systems,  we  can  use  Machine  Learning  to 
exploit  features  of  the  source  code  that  lands  in  public  repositories  long  before  users’  systems 
are  patched  for  vulnerabilities.  We  consider  Firefox  3  as  a  representative  open-source  system, 
where  security  patches  and  non-security  patches  land  in  the  project’s  public  trunk  in  between 
(roughly)  monthly  releases  of  the  project.  Upon  each  minor  release  of  Firefox,  Mozilla 
retroactively  discloses  discovered  vulnerabilities  in  the  previous  version  of  the  software  which 
are  patched  in  the  latest  release.  We  use  these  disclosures  to  label  the  source  code  available 
in  the  repository  up  to  the  latest  release.  Using  features  of  the  source  code  change-sets 
such  as  author,  time  the  patch  landed  in  the  repository,  top-level  directory  of  the  project 
and file-types most affected by the diff, and diff size, we can train a discriminative model
to  differentiate  between  non-security  and  security  patches.  As  new  patches  land  in  the 
repository  we  use  the  model  to  rank  them  according  to  the  likelihood  of  fixing  a  vulnerability. 
An  attacker  using  this  ranking  would  then  examine  the  patches  by-hand  (or  perhaps  with 
expensive  program  analysis)  until  a  security  patch  is  found.  The  benefit  of  this  attack  is 
that  the  window  of  vulnerability  is  extended  back  in  time  to  the  point  at  which  the  first 
security  patch  is  found  within  an  inter-release  period.  The  attack’s  cost  is  simply  the  number 
of  patches  that  must  be  examined  to  find  a  security  patch.  We  use  off-the-shelf  software 
for  Support  Vector  Machine  (SVM)  learning  which  requires  no  knowledge  of  the  SVM,  to 
learn  to  rank  patches  online  throughout  the  year  of  active  Firefox  3  development.  We  show 
that  after  a  warm-up  period,  for  39%  of  the  days  within  a  span  of  8  months  the  SVM- 
assisted  attacker  need  only  examine  one  or  two  patches  to  find  a  security  patch.  Moreover 
the  same  attacker,  by  examining  the  top  two  ranked  patches  every  day  during  the  same 
8  month  period  extends  her  aggregate  window  of  vulnerability  (over  all  the  inter-release 
periods  during  the  8  months)  by  5  months.  We  compare  these  results  to  an  unassisted 
attacker,  who  selects  patches  to  examine  uniformly  at  random.  Finally  we  propose  that 
Mozilla  alter  their  vulnerability  life-cycle  by  landing  security  patches  in  a  private  release 
branch  instead  of  the  public  trunk.  The  cost  of  such  a  counter-measure  (which  can  be 
mitigated  by  opening  the  private  branch  prior  to  the  next  release)  is  to  quality  assurance, 
since  security  patches  would  not  be  tested  as  thoroughly  and  bugs  could  be  introduced  by 
merging  branches  further  into  the  process. 
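
The attack cost defined above, and the unassisted baseline, can be made concrete with a short sketch. The scores and labels below are hypothetical stand-ins for a trained model's output on one day's patches; the baseline uses the standard expectation $(n+1)/(k+1)$ for the position of the first of $k$ security patches among $n$ patches examined in uniformly random order.

```python
def attacker_cost(scores, is_security):
    """Patches examined, in decreasing score order, until a security patch is hit."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    for examined, i in enumerate(order, start=1):
        if is_security[i]:
            return examined
    return None  # no security patch among the candidates

def random_attacker_expected_cost(n_patches, n_security):
    """Expected cost when patches are examined uniformly at random without replacement."""
    return (n_patches + 1) / (n_security + 1)

# Hypothetical model scores for six patches, one of which fixes a vulnerability.
scores = [0.1, 0.7, 0.4, 0.9, 0.2, 0.3]
is_security = [False, True, False, False, False, False]

assisted = attacker_cost(scores, is_security)
unassisted = random_attacker_expected_cost(len(scores), sum(is_security))
print(assisted, unassisted)
```

Even an imperfect ranking helps the attacker: in this toy case the security patch is ranked second, so two patches are examined instead of the 3.5 expected under random selection.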

7.2 Open Problems

While  the  research  presented  here  makes  several  contributions  to  Machine  Learning  and 
Security  as  described  above,  a  number  of  new  problems  remain  open. 


7.2.1  Adversarial  Information  and  Control 


Section defined the adversarial capabilities of information and control as the information
available to the adversary regarding the learning map, learner state, feature space
and  benign  data  generation  process,  and  the  amount  of  control  the  adversary  can  exert  over 
the  learner’s  training  or  test  data. 

A common theme throughout this dissertation is that the level of adversarial information
and control governing an adversary greatly affects the performance of that adversary’s
attacks  on  a  learning  system. 

Similar conclusions can be made based on previous studies. For example, Kearns and
Li (1993) characterized the effects of a $\beta$ proportion of malicious noise on learning
under the PAC model (Valiant, 1984). They also bound the maximum level $\beta^*$ of malicious
noise under which learnability is still possible. In Robust Statistics, the notion of
breakdown point is similar, characterizing the proportion of a sample that can be arbitrarily
corrupted without an estimator being forced to diverge to $\infty$. However, in these cases
and many others including standard Online Learning Theory results, worst-case assumptions are
made on the adversary. In Online Learning the adversary has complete information and control;
and for breakdown points and in the work of Kearns and Li (1993), the data that is corrupted
gets corrupted arbitrarily.

In reality, as is highlighted by the two case-studies of Chapter, useful threat models tend
to limit the adversary’s capabilities. As described in the examples of Section 1.3, the form
of information or control possessed by the adversary can vary greatly from one adversarial
domain to another. For example, in the email spam domain control corresponds to inserting
messages of label spam into the training corpus without altering existing training messages.
In the network intrusion domain, by contrast, control is most naturally viewed as adding
volume to a small number of columns of the link traffic matrix (corresponding to injecting
chaff into a single flow). Thus we may model these domains, and perhaps many others, by a
benign data generation process whose output is corrupted according to some transformation
$T \in \mathcal{T}_\theta$ before being revealed to the learner. In the above examples,
$\mathcal{T}_\theta$ corresponds to the set of possible transformations the adversary could
apply, having level of control $\theta$. For example this may be injections of a proportion
$\theta$ of spam messages, or addition of chaff of average volume $\theta$ to a flow matrix.
The adversary (or perhaps an adversarial ‘nature’) selects which element of
$\mathcal{T}_\theta$ is used. We might assume that the learner is aware of the family
of transformations $\mathcal{T}_\theta$ but not the particular $T$.
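
The two example forms of control can be written as simple transformations parametrized by the level of control. This is only a sketch of the modeling idea: the function names, data representations and values are hypothetical, and each function is one particular element of its family rather than the family itself.

```python
import random

def inject_fraction(benign, attack_pool, theta, rng):
    """Append attack points so they form a proportion theta of the output.

    Existing benign points are left untouched, mirroring the spam example.
    """
    n_attack = round(theta * len(benign) / (1.0 - theta))
    return benign + [rng.choice(attack_pool) for _ in range(n_attack)]

def add_chaff(flow_volumes, theta):
    """Add chaff of average volume theta to each time step of one flow."""
    return [v + theta for v in flow_volumes]

rng = random.Random(0)

# Spam domain: 99 benign training messages, 1% control over the corpus.
benign_mail = [("ham", i) for i in range(99)]
corrupted_mail = inject_fraction(benign_mail, [("attack", 0)], theta=0.01, rng=rng)

# Network domain: volumes of one flow over three time steps.
corrupted_flow = add_chaff([5.0, 7.0, 6.0], theta=0.5)
print(len(corrupted_mail), corrupted_flow)
```

Both functions take benign data and a control level and return what the learner would see; the learner may know the family and the level without knowing which specific transformation was applied.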

One  open  question  is  to  represent  typical  forms  of  adversarial  control  found  in  real-world 
applications  of  Machine  Learning.  Given  such  example  T’s  we  may  ask  how  much  control 
is  too  much. 


Open Problem 67. For a given transformation class $\mathcal{T}_\theta$, under what levels of
control $\theta$ is learning possible?

We  may  further  want  to  characterize  which  forms  of  control  are  tolerable  and  which  are 
not. 

Open Problem 68. What combinatorial properties of transformation classes $\mathcal{T}$
characterize learnability under transformations $T \in \mathcal{T}$?

In addition to control, we may wish to understand the fundamental benefits to the
attacker of adversarial information. In this case one could form a repeated game as is
typical in Online Learning. However again we could ask that benign data come from some
data generation process. This time the adversary, prior to transforming the data, is given
access to some limited view of the data $V \in \mathcal{V}_\phi$. For example in the email
spam domain our attackers had access to a sample (of a Usenet newsgroup) drawn from a
distribution like that which generated the training corpus. The level of information $\phi$
corresponded to the similarity between the adversary’s and victim’s generating distributions.
In the network anomaly detection problem, the adversary may be able to view some subset of
the link traffic matrix columns. The adversary may not know which view
$V \in \mathcal{V}_\phi$ will be in effect, but she may have knowledge of
$\mathcal{V}_\phi$ including an estimate of $\phi$. Thus the adversary is given access
to the image of the data under some $V \in \mathcal{V}_\phi$ after which she applies the
transformation $T \in \mathcal{T}_\theta$ to the data. Of course this transformation (or the
entire family) may depend on the information the attacker received. The learner is then
revealed the corrupted data as before. We may ask similar questions to above.

Open Problem 69. Under what levels of information $\phi$ is learning possible, for given
information class $\mathcal{V}_\phi$?

Open Problem 70. What combinatorial properties of information class $\mathcal{V}$
characterize learnability? What about when the data revealed to the learner is also
transformed by some $T \in \mathcal{T}_\theta$?

Finally,  it  would  be  interesting  to  understand  the  relationship  between  information, 
control  and  learnability. 

Open Problem 71. What are the trade-offs between learnability, adversarial information,
and adversarial control? Both for varying levels $\phi$ and $\theta$, but also for different
classes of information and control.

These  questions  would  make  interesting  extensions  to  the  basic  Online  Learning  Theory 
model, for example. Similar modifications to ideas of influence and breakdown points could
also  be  conceivably  made  in  Robust  Statistics. 

7.2.2  Covert  Attacks 

Several  attacks  on  Machine  Learning  have  been  constructed  in  this  dissertation:  the 
Causative Availability attacks on the SpamBayes email spam filter, the Causative Integrity
attacks on the PCA-based network-wide anomaly detector, and the Exploratory attacks on
convex-inducing classifiers. In each case, our motivation for limiting the attack intensity
was implicitly related to desiring a covert attack. Stealthiness was explicitly measured in
the case-study on poisoning PCA, in which during the Boiling Frog poisoning attacks, we
allowed the model (PCA and ANTIDOTE) to reject traffic from being included in the training
set.  The  lower  the  level  of  rejection,  the  more  stealthy  the  level  of  poisoning.  In  the  evasion 
problem,  we  motivated  the  secondary  goal  of  low  query  complexity  by  the  desire  to  be  covert. 
Intuitively,  given  too  many  queries,  the  classifier  may  become  suspicious  that  an  attack  is 
underway.  An  interesting  direction  for  future  research  is  attacks  that  are  covert  by  design. 

Open  Problem  72.  What  are  good  general  strategies  for  designing  attacks  on  learners  that 
are  covert? 

One  way  to  view  the  stealthiness  question  is  as  a  kind  of  complement  of  designing  learners 
that  have  good  worst-case  performance.  For  example  the  reactive  risk  management  strategy 
of  Chapter  came  with  guarantees  about  its  performance  for  any  sequence  of  attacks.  In 
this  way  we  could  be  confident  that  it  would  perform  well  as  a  defense. 

Open Problem 73. Consider a learning-based defender that is periodically re-trained;
during training, the previously learned model is used to reject training data that appears
to be malicious. How can data poisoning attacks be designed with guarantees on stealthiness,
i.e., guarantees on a minimal level of poisoned data being included in the training set
despite the learner’s best efforts?

We  have  used  stealthiness  to  justify  limited  adversarial  control  in  the  learner’s  threat 
model.  Conversely  it  is  clear  that  some  forms  of  adversarial  control  will  be  stealthier  than 
others.  On  the  other  hand,  those  forms  of  control  that  are  stealthy  would  likely  have  less 
of  a  manipulative  effect  on  the  learner. 

Open  Problem  74.  What  trade-offs  between  stealthiness  and  learner  manipulation  are 
possible? 


7.2.3  Privacy-Preserving  Learning 

Chapter  drew several new connections between differential privacy and learning theory.
To prove that our mechanisms preserve differential privacy we computed the global
sensitivity of the SVM’s primal solution using results from algorithmic stability.
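
As a generic illustration of how a global sensitivity bound is used, the sketch below implements the standard Laplace mechanism of Dwork et al. (2006): noise with scale sensitivity/epsilon is added independently to each coordinate of a released vector. It is a minimal sketch under stated assumptions, not the mechanism analyzed in the chapter; the weight vector, the sensitivity value, and the function names are hypothetical placeholders.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(weights, l1_sensitivity, epsilon, rng=None):
    """Release `weights` plus Laplace noise calibrated to an L1 global sensitivity bound.

    `l1_sensitivity` is assumed to upper-bound the L1 change in `weights`
    when a single training example is replaced (hypothetical input here).
    """
    if l1_sensitivity <= 0 or epsilon <= 0:
        raise ValueError("l1_sensitivity and epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = l1_sensitivity / epsilon
    return [w + laplace_noise(scale, rng) for w in weights]


if __name__ == "__main__":
    # Hypothetical weight vector and sensitivity bound, for illustration only.
    print(laplace_mechanism([0.5, -1.0, 2.0], l1_sensitivity=0.1, epsilon=1.0))
```

A smaller sensitivity bound or a larger epsilon shrinks the noise scale, which is why tight sensitivity bounds matter for utility.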


Open Problem 75. What other stable learning algorithms can be made to preserve differential
privacy using techniques similar to our own?


Various notions of stability have been shown to be necessary and/or sufficient for learnability
(Bousquet and Elisseeff, 2002; Mukherjee et al., 2006). How could such results be
connected to differential privacy?

Open Problem 76. What is the relationship between algorithms that learn (e.g., are
consistent) and those that can be made differentially private with minimal perturbations? In
particular, can necessary notions of stability for learnability be used to achieve differential
privacy?


If the answer is positive, then learnability implies differential privacy (with minimal
changes to the algorithm). Many similar questions exist. What are the fundamental
trade-offs between privacy and utility? And how can Robust Statistics be related to
differential privacy in general, extending the results of Dwork and Lei (2009)?



Bibliography 


Nabil R. Adam and John C. Worthmann. Security-control methods for statistical databases:
a  comparative  study.  ACM  Computing  Surveys,  21(4):515-556,  1989. 

Lucas Adamski. Security severity ratings, 2009.
https://wiki.mozilla.org/Security_Severity_Ratings [Online; accessed 6-May-2010].

Alekh  Agarwal,  Peter  Bartlett,  and  Max  Dama.  Optimal  allocation  strategies  for  the  dark 
pool  problem.  In  Proceedings  of  the  Thirteenth  International  Conference  on  Artificial 
Intelligence  and  Statistics  (AISTATS’2010),  volume  9  of  Journal  of  Machine  Learning 
Research,  pages  9-16,  2010. 

Eugene Agichtein, Eric Brill, and Susan Dumais. Improving web search ranking by incorporating
user behavior information. In Proceedings of the 29th Annual International ACM
SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’06),
pages 19-26, 2006.

Ross  Anderson.  Why  information  security  is  hard — An  economic  perspective.  Proceedings 
of the 17th Annual Computer Security Applications Conference (ACSAC ’01), pages 358-365,
2001.

Jaime Arguello, Fernando Diaz, Jamie Callan, and Jean-Francois Crespo. Sources of evidence
for vertical selection. In Proceedings of the 32nd International ACM SIGIR Conference
on Research and Development in Information Retrieval (SIGIR ’09), pages 315-322,
2009.

Terrence August and Tunay I. Tunca. Network software security and user incentives.
Management Science, 52(11):1703-1720, 2006.

Paramvir  Bahl,  Ranveer  Chandra,  Albert  Greenberg,  Srikanth  Kandula,  David  A.  Maltz, 
and  Ming  Zhang.  Towards  highly  reliable  enterprise  network  services  via  inference  of 
multi-level dependencies. In Proceedings of the ACM SIGCOMM 2007 Conference on
Applications,  Technologies,  Architectures,  and  Protocols  for  Computer  Communications, 
pages  13-24,  2007. 

Pierre Baldi and Søren Brunak. Bioinformatics: the machine learning approach. MIT Press,
Cambridge,  MA,  USA,  2001. 


Boaz Barak, Kamalika Chaudhuri, Cynthia Dwork, Satyen Kale, Frank McSherry, and Kunal
Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table
release. In Proceedings of the Twenty-Sixth ACM SIGMOD-SIGACT-SIGART Symposium
on Principles of Database Systems (PODS ’07), pages 273-282, 2007.

Michael Barbaro and Tom Zeller Jr. A face is exposed for AOL searcher no. 4417749. New
York Times, Aug 9, 2006.

Marco Barreno, Blaine Nelson, Russell Sears, Anthony D. Joseph, and J. D. Tygar. Can
machine learning be secure? In Proceedings of the ACM Symposium on InformAtion,
Computer and Communications Security (ASIACCS’06), pages 16-25, 2006.

Marco Barreno, Blaine Nelson, Anthony D. Joseph, and J. D. Tygar. The security of machine
learning.  Machine  Learning,  2010.  to  appear. 

Marco Antonio Barreno. Evaluating the security of machine learning algorithms. Dissertation
UCB/EECS-2008-63, Department of Electrical Engineering and Computer Sciences,
University of California at Berkeley, 2008.

Chris Beard. Introducing Test Pilot, March 2008.
http://labs.mozilla.com/2008/03/introducing-test-pilot/ [Online; accessed 6-May-2010].

Dimitris Bertsimas and Santosh Vempala. Solving convex programs by random walks.
Journal of the ACM, 51(4):540-556, 2004.

Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag, 2006.

Avrim  Blum,  Cynthia  Dwork,  Frank  McSherry,  and  Kobbi  Nissim.  Practical  privacy:  the 
SuLQ  framework.  In  Proceedings  of  the  Twenty-Fourth  ACM  SIGMOD-SIGACT-SIGART 
Symposium  on  Principles  of  Database  Systems  (PODS  ’05),  pages  128-138,  2005. 

Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theory approach to non-interactive
database privacy. In Proceedings of the 40th Annual ACM Symposium on Theory of
Computing (STOC ’08), pages 609-618, 2008.

Peter Bodik, Rean Griffith, Charles Sutton, Armando Fox, Michael I. Jordan, and David A.
Patterson. Statistical machine learning makes automatic control practical for internet
datacenters. In Proceedings of the Workshop on Hot Topics in Cloud Computing (HotCloud
’09), 2009.

Peter Bodik, Moises Goldszmidt, Armando Fox, Dawn B. Woodard, and Hans Andersen.
Fingerprinting the datacenter: Automated classification of performance crises. In
Proceedings of EuroSys 2010, 2010. To appear.

Olivier  Bousquet  and  Andre  Elisseeff.  Stability  and  generalization.  Journal  of  Machine 
Learning  Research,  2(Mar):499-526,  2002. 


Stephen  Boyd  and  Lieven  Vandenberghe.  Convex  Optimization.  Cambridge  University 
Press,  2004. 

Daniela Brauckhoff, Kave Salamatian, and Martin May. Applying PCA for traffic anomaly
detection: Problems and solutions. In Proceedings of the 28th IEEE International
Conference on Computer Communications (INFOCOM 2009), pages 2866-2870, 2009.

Michael  P.  S.  Brown,  William  Noble  Grundy,  David  Lin,  Nello  Cristianini,  Charles  Walsh 
Sugnet,  Terrence  S.  Furey,  Manuel  Ares,  Jr.,  and  David  Haussler.  Knowledge-based 
analysis  of  microarray  gene  expression  data  by  using  support  vector  machines.  Proceedings 
of the National Academy of Sciences, 97(1):262-267, 2000.

David  Brumley,  Pongsin  Poosankam,  Dawn  Song,  and  Jiang  Zheng.  Automatic  patch-based 
exploit  generation  is  possible:  Techniques  and  implications.  In  Proceedings  of  the  2008 
IEEE  Symposium  on  Security  and  Privacy  (SP  ’08),  pages  143-157,  2008. 

Christopher  J.  C.  Burges.  A  tutorial  on  support  vector  machines  for  pattern  recognition. 
Data  Mining  and  Knowledge  Discovery,  2(2):121-167,  1998. 

Huseyin Cavusoglu, Srinivasan Raghunathan, and Wei Yue. Decision-theoretic and
game-theoretic approaches to IT security investment. Journal of Management Information
Systems, 25(2):281-304, 2008.

Nicolò Cesa-Bianchi and Gabor Lugosi. Prediction, Learning, and Games. Cambridge
University Press, 2006.

Nicolò Cesa-Bianchi, Yoav Freund, David P. Helmbold, David Haussler, Robert E. Schapire,
and Manfred K. Warmuth. How to use expert advice. In Proceedings of the Twenty-Fifth
Annual ACM Symposium on Theory of Computing, pages 382-391, 1993.

Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire,
and  Manfred  K.  Warmuth.  How  to  use  expert  advice.  Journal  of  the  Association  for 
Computing  Machinery,  44(3):427-485,  May  1997. 

Deeparnab Chakrabarty, Aranyak Mehta, and Vijay V. Vazirani. Design is as easy as
optimization. In Proceedings of the 33rd International Colloquium on Automata, Languages
and Programming (ICALP), volume Part I of LNCS 4051, pages 477-488, 2006.

Chih-Chung  Chang  and  Chih-Jen  Lin.  LIBSVM:  a  library  for  support  vector  machines,  2001. 
Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm [Online; accessed
5-May-2010]. 

Kamalika Chaudhuri and Claire Monteleoni. Privacy-preserving logistic regression. In
Advances in Neural Information Processing Systems 21, pages 289-296, 2009.


Yu-Chung Cheng, Mikhail Afanasyev, Patrick Verkaik, Peter Benko, Jennifer Chiang,
Alex C. Snoeren, Stefan Savage, and Geoffrey M. Voelker. Automating cross-layer
diagnosis of enterprise wireless networks. In Proceedings of the ACM SIGCOMM 2007
Conference on Applications, Technologies, Architectures, and Protocols for Computer
Communications, pages 25-36, 2007.

Simon P. Chung and Aloysius K. Mok. Allergy attack against automatic signature
generation. In Proceedings of the International Symposium on Recent Advances in Intrusion
Detection (RAID), pages 61-80, September 2006.

Simon  P.  Chung  and  Aloysius  K.  Mok.  Advanced  allergy  attacks:  Does  a  corpus  really 
help?  In  Proceedings  of  the  International  Symposium  on  Recent  Advances  in  Intrusion 
Detection  (RAID),  pages  236-255,  September  2007. 

Massimiliano  Ciaramita,  Vanessa  Murdock,  and  Vassilis  Plachouras.  Online  learning  from 
click data for sponsored search. In Proceeding of the 17th International Conference on
World Wide Web (WWW ’08), pages 227-236, 2008.

Gordon  Cormack  and  Thomas  Lynam.  Spam  corpus  creation  for  TREC.  In  Proceedings  of 
the  Conference  on  Email  and  Anti-Spam  (CEAS),  July  2005. 

Marco  Cremonini.  Evaluating  information  security  investments  from  attackers  perspective: 
the return-on-attack (ROA). In Fourth Workshop on the Economics of Information
Security, 2005.

Nello  Cristianini  and  John  Shawe-Taylor.  An  Introduction  to  Support  Vector  Machines. 
Cambridge  University  Press,  2000. 

Christophe  Croux  and  Anne  Ruiz-Gazen.  A  fast  algorithm  for  robust  principal  components 
based  on  projection  pursuit.  In  Proceedings  in  Computational  Statistics  (Compstat’96), 
pages  211-216,  1996. 

Christophe Croux and Anne Ruiz-Gazen. High breakdown estimators for principal
components: the projection-pursuit approach revisited. Journal of Multivariate Analysis, 95(1),
2005.

Christophe Croux, Peter Filzmoser, and M. Rosario Oliveira. Algorithms for
projection-pursuit robust principal component analysis. Chemometrics and Intelligent Laboratory
Systems, 87(2), 2007.

Hengjian Cui, Xuming He, and Kai W. Ng. Asymptotic distributions of principal
components based on robust dispersions. Biometrika, 90(4):953-966, 2003.

Nilesh  Dalvi,  Pedro  Domingos,  Mausam,  Sumit  Sanghai,  and  Deepak  Verma.  Adversarial 
classification. In Proceedings of the Tenth ACM SIGKDD International Conference on
Knowledge  Discovery  and  Data  Mining  (KDD’04),  pages  99-108,  2004. 


Susan  J.  Devlin,  Ramanathan  Gnanadesikan,  and  Jon  R.  Kettenring.  Robust  estimation 
of  dispersion  matrices  and  principal  components.  Journal  of  the  American  Statistical 
Association,  76(374):354-362,  1981. 

Luc  P.  Devroye  and  T.  J.  Wagner.  Distribution-free  performance  bounds  for  potential 
function  rules.  IEEE  Transactions  on  Information  Theory,  25(5):601-604,  1979. 

Irit Dinur and Kobbi Nissim. Revealing information while preserving privacy. In Proceedings
of  the  Twenty-Second  ACM  SIGMOD-SIGACT-SIGART  Symposium  on  Principles  of 
Database  Systems  (PODS  ’03),  pages  202-210,  2003. 

Pat  Doyle,  Julia  I.  Lane,  Jules  J.  M.  Theeuwes,  and  Laura  V.  Zayatz,  editors.  Confidentiality, 
Disclosure  and  Data  Access:  Theory  and  Practical  Application  for  Statistical  Agencies. 
Elsevier,  2001. 

Isabel  Drost  and  Tobias  Scheffer.  Thwarting  the  nigritude  ultramarine:  Learning  to  identify 
link  spam.  In  Proceedings  of  the  European  Conference  on  Machine  Learning  (ECML  ’05), 
pages  96-107,  2005. 

Thomas  Duebendorfer  and  Stefan  Frei.  Why  silent  updates  boost  security.  Tech  Report 
TIK  302,  ETH,  2009. 

Cynthia Dwork. Differential privacy. In Proceedings of the 33rd International Colloquium
on Automata, Languages and Programming (ICALP), pages 1-12, 2006.

Cynthia Dwork. Differential privacy: A survey of results. In Proceedings of the 5th
International Conference on Theory and Applications of Models of Computation (TAMC),
volume 4978 of Lecture Notes in Computer Science, pages 1-19, 2008.

Cynthia Dwork. A firm foundation for private data analysis. Communications of the ACM,
2010.  to  appear. 

Cynthia  Dwork  and  Jing  Lei.  Differential  privacy  and  robust  statistics.  In  Proceedings  of 
the 41st Annual ACM Symposium on Theory of Computing (STOC ’09), pages 371-380,
2009. 

Cynthia  Dwork  and  Sergey  Yekhanin.  New  efficient  attacks  on  statistical  disclosure  control 
mechanisms.  In  Proceedings  of  the  28th  Annual  Conference  on  Cryptology  (CRYPTO 
2008),  pages  469-480,  2008. 

Cynthia  Dwork,  Frank  McSherry,  Kobbi  Nissim,  and  Adam  Smith.  Calibrating  noise  to 
sensitivity  in  private  data  analysis.  In  Proceedings  of  the  3rd  Theory  of  Cryptography 
Conference  (TCC  2006),  pages  265-284,  2006. 

Cynthia  Dwork,  Frank  McSherry,  and  Kunal  Talwar.  The  price  of  privacy  and  the  limits  of 
LP  decoding.  In  Proceedings  of  the  Thirty-Ninth  Annual  ACM  Symposium  on  Theory  of 
Computing  (STOC  ’07),  pages  85-94,  2007. 


Cynthia  Dwork,  Moni  Naor,  Omer  Reingold,  Guy  N.  Rothblum,  and  Salil  Vadhan.  On  the 
complexity  of  differentially  private  data  release:  efficient  algorithms  and  hardness  results. 
In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC ’09),
pages  381-390,  2009. 

Martin Dyer and Alan Frieze. Computing the volume of convex bodies: A case where
randomness provably helps. In Proceedings of the AMS Symposium on Probabilistic
Combinatorics and Its Applications, pages 123-170, 1992.

Usama  Mohammad  Fayyad.  On  the  induction  of  decision  trees  for  multiple  concept  learning. 
PhD thesis, University of Michigan, Ann Arbor, MI, USA, 1992.

Darin Fisher. Multi-process architecture, July 2008.
http://dev.chromium.org/developers/design-documents/multi-process-architecture
[Online; accessed 6-May-2010].

Ronald A. Fisher. Question 14: Combining independent tests of significance. American
Statistician,  2(5):30-30J,  1948. 

Prahlad Fogla and Wenke Lee. Evading network anomaly detection systems: Formal
reasoning and practical techniques. In Proceedings of the 13th ACM Conference on Computer
and Communications Security (CCS’06), pages 59-68, 2006.

Jason Franklin, Vern Paxson, Adrian Perrig, and Stefan Savage. An inquiry into the
nature and causes of the wealth of internet miscreants. In Proceedings of the 2007 ACM
Conference on Computer and Communications Security, pages 375-388, 2007.

Stefan Frei, Thomas Duebendorfer, and Bernhard Plattner. Firefox (in)security update
dynamics exposed. SIGCOMM Computer Communication Review, 39(1):16-22, 2009.

Yoav  Freund  and  Robert  Schapire.  A  short  introduction  to  boosting.  Journal  of  the  Japanese 
Society  for  Artificial  Intelligence,  14(5):771-780,  1999a. 

Yoav  Freund  and  Robert  E.  Schapire.  Adaptive  game  playing  using  multiplicative  weights. 
Games and Economic Behavior, 29:79-103, 1999b.

Jeffrey Friedberg. Internet fraud battlefield, April 2007.
http://www.ftc.gov/bcp/workshops/proofpositive/Battlefield_Overview.pdf [Online; accessed 6-May-2010].

Neal  Fultz  and  Jens  Grossklags.  Blue  versus  red:  Towards  a  model  of  distributed  security 
attacks.  In  Proceedings  of  the  Thirteenth  International  Conference  Financial  Cryptography 
and  Data  Security,  pages  167-183,  2009. 

Arpita Ghosh, Benjamin I. P. Rubinstein, Sergei Vassilvitskii, and Martin Zinkevich.
Adaptive bidding for display advertising. In Proceedings of the 18th International World Wide
Web Conference (WWW 2009), pages 251-260, 2009.


Lawrence  A.  Gordon  and  Martin  P.  Loeb.  The  economics  of  information  security  investment. 
ACM  Transactions  on  Information  and  System  Security,  5(4):438-457,  2002. 

Paul Graham. A plan for spam. http://www.paulgraham.com/spam.html, August 2002.

Jens Grossklags, Nicolas Christin, and John Chuang. Secure or insure?: A game-theoretic
analysis of information security games. In Proceeding of the 17th International Conference
on World Wide Web, pages 209-218, 2008.

Guavus, 2010. http://www.guavus.com [Online; accessed 22-April-2010].

Zoltan  Gyongyi  and  Hector  Garcia-Molina.  Link  spam  alliances.  In  Proceedings  of  the  31st 
International  Conference  on  Very  Large  Data  Bases  (VLDB  ’05),  pages  517-528,  2005. 

Frank  R  Hampel,  Elvezio  M  Ronchetti,  Peter  J  Rousseeuw,  and  Werner  A  Stahel.  Robust 
Statistics:  The  Approach  Based  on  Influence  Functions.  Wiley  Series  in  Probability  and 
Mathematical  Statistics.  Wiley,  1980. 

Kjell Hausken. Returns to information security investment: The effect of alternative
information security breach functions on optimal investment and sensitivity to vulnerability.
Information Systems Frontiers, 8(5):338-349, 2006.

Michael Hay, Chao Li, Gerome Miklau, and David Jensen. Accurate estimation of the degree
distribution  of  private  networks.  In  Proceedings  of  the  2009  Ninth  IEEE  International 
Conference  on  Data  Mining  (ICDM’09),  pages  169-178,  2009. 

Elad  Hazan  and  Satyen  Kale.  On  stochastic  and  worst-case  models  for  investing.  In  Advances 
in  Neural  Information  Processing  Systems  (NIPS)  22,  pages  709-717,  2010. 

Mark Herbster and Manfred K. Warmuth. Tracking the best expert. Machine Learning,
32(2):151-178, 1998.

Ola Hössjer and Christophe Croux. Generalizing univariate signed rank statistics for testing
and estimating a multivariate location parameter. Journal of Nonparametric Statistics,
4(3):293-308, 1995.

Michael  Howard.  Attack  surface:  Mitigate  security  risks  by  minimizing  the  code  you  expose 
to  untrusted  users.  MSDN  Magazine,  November  2004. 

Ling Huang, Michael I. Jordan, Anthony Joseph, Minos Garofalakis, and Nina Taft.
In-network PCA and anomaly detection. In Advances in Neural Information Processing
Systems 19 (NIPS 2006), pages 617-624, 2007.

Peter J. Huber. Robust Statistics. Wiley Series in Probability and Mathematical Statistics.
Wiley,  1981. 


Nicole Immorlica, Kamal Jain, Mohammad Mahdian, and Kunal Talwar. Click fraud
resistant methods for learning click-through rates. In Proceedings of the First International
Workshop on Internet and Network Economics (WINE 2005), volume 3828 of Lecture
Notes in Computer Science, pages 34-45, 2005.

International Organization for Standardization. Information technology - security
techniques - code of practice for information security management. ISO/IEC 17799:2005,
ISO, 2005.

J.  Edward  Jackson  and  Govind  S.  Mudholkar.  Control  procedures  for  residuals  associated 
with  principal  component  analysis.  Technometrics,  21(3):341-349,  1979. 

Thorsten  Joachims.  Optimizing  search  engines  using  clickthrough  data.  In  Proceedings  of 
the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining  (KDD  ’02),  pages  133-142,  2002. 

Srikanth Kandula, Ranveer Chandra, and Dina Katabi. What’s going on? Learning
communication rules in edge networks. In Proceedings of the ACM SIGCOMM 2008 Conference
on Applications, Technologies, Architectures, and Protocols for Computer
Communications, pages 87-98, 2008.

Chris  Kanich,  Christian  Kreibich,  Kirill  Levchenko,  Brandon  Enright,  Geoffrey  M.  Voelker, 
Vern  Paxson,  and  Stefan  Savage.  Spamalytics:  An  empirical  analysis  of  spam  marketing 
conversion. In Proceedings of the 2008 ACM Conference on Computer and
Communications Security, pages 3-14, 2008.

Khalid  Kark,  Jonathan  Penn,  and  Alissa  Dill.  2008  CISO  priorities:  The  right  objectives 
but the wrong focus. Le Magazine de la Securite Informatique, April 2009.

Christoph  Karlberger,  Gunther  Bayler,  Christopher  Kruegel,  and  Engin  Kirda.  Exploiting 
redundancy in natural language to penetrate Bayesian spam filters. In Proceedings of the
USENIX  Workshop  on  Offensive  Technologies  (WOOT),  pages  1-7,  August  2007. 

Shiva  Prasad  Kasiviswanathan,  Homin  K.  Lee,  Kobbi  Nissim,  Sofya  Raskhodnikova,  and 
Adam  Smith.  What  can  we  learn  privately?  In  Proceedings  of  the  49th  Annual  IEEE 
Symposium on Foundations of Computer Science (FOCS ’08), pages 531-540, 2008.

Michael  Kearns  and  Ming  Li.  Learning  in  the  presence  of  malicious  errors.  SIAM  Journal 
on  Computing,  22(4):807-837,  1993. 

Michael Kearns and Dana Ron. Algorithmic stability and sanity-check bounds for
leave-one-out cross-validation. Neural Computation, 11:1427-1453, 1999.

Hyang-Ah  Kim  and  Brad  Karp.  Autograph:  Toward  automated,  distributed  worm  signature 
detection.  In  Proceedings  of  the  USENIX  Security  Symposium,  pages  271-286,  2004. 

George  Kimeldorf  and  Grace  Wahba.  Some  results  on  Tchebycheffian  spline  functions. 
Journal of Mathematical Analysis and Applications, 33(1):82-95, 1971.


Bryan Klimt and Yiming Yang. Introducing the Enron corpus. In Proceedings of the
Conference on Email and Anti-Spam (CEAS), July 2004.

Aleksandra  Korolova,  Krishnaram  Kenthapadi,  Nina  Mishra,  and  Alex  Ntoulas.  Releasing 
search queries and clicks privately. In Proceedings of 18th International World Wide Web
Conference  (WWW’09),  pages  171-180,  2009. 

Vineet  Kumar,  Rahul  Telang,  and  Tridas  Mukhopadhyay.  Optimal  information  security 
architecture for the enterprise, 2008. http://ssrn.com/abstract=1086690 [Online;
accessed 6-May-2010].

Samuel Kutin and Partha Niyogi. Almost-everywhere algorithmic stability and
generalization error. Technical report TR-2002-03, Computer Science Department, University of
Chicago, 2002.

Anukool  Lakhina,  Mark  Crovella,  and  Christophe  Diot.  Diagnosing  network-wide  traffic 
anomalies. In Proceedings of the ACM SIGCOMM 2004 Conference on Applications,
Technologies, Architectures, and Protocols for Computer Communications, pages 219-230,
2004a.

Anukool Lakhina, Mark Crovella, and Christophe Diot. Characterization of network-wide
anomalies in traffic flows. In Proceedings of the 4th ACM SIGCOMM Conference on
Internet Measurement (IMC ’04), pages 201-206, 2004b.

Anukool Lakhina, Konstantina Papagiannaki, Mark Crovella, Christophe Diot, Eric D.
Kolaczyk, and Nina Taft. Structural analysis of network traffic flows. SIGMETRICS
Performance Evaluation Review, 32(1):61-72, 2004c.

Anukool  Lakhina,  Mark  Crovella,  and  Christophe  Diot.  Mining  anomalies  using  traffic 
feature  distributions.  In  Proceedings  of  the  2005  Conference  on  Applications,  Technologies, 
Architectures, and Protocols for Computer Communications (SIGCOMM ’05), pages 217-228,
2005a.

Anukool  Lakhina,  Mark  Crovella,  and  Christophe  Diot.  Detecting  distributed  attacks  using 
network-wide flow traffic. In Proceedings of the FloCon 2005 Analysis Workshop, 2005b.

Aleksandar  Lazarevic,  Levent  Ertoz,  Vipin  Kumar,  Aysel  Ozgur,  and  Jaideep  Srivastava. 
A  comparative  study  of  anomaly  detection  schemes  in  network  intrusion  detection.  In 
Proceedings  of  the  SIAM  International  Conference  on  Data  Mining,  pages  25-36,  2003. 

Guoying  Li  and  Zhonglian  Chen.  Projection-pursuit  approach  to  robust  dispersion  matrices 
and  principal  components:  Primary  theory  and  Monte  Carlo.  Journal  of  the  American 
Statistical Association, 80(391):759-766, 1985.

Xin Li, Fang Bian, Mark Crovella, Christophe Diot, Ramesh Govindan, Gianluca Iannaccone,
and Anukool Lakhina. Detection and identification of network anomalies using
sketch subspaces. In Proceedings of the 6th ACM SIGCOMM Conference on Internet
Measurement (IMC ’06), pages 147-152, 2006a.


Xin Li, Fang Bian, Hui Zhang, Christophe Diot, Ramesh Govindan, Wei Hong, and Gianluca
Iannaccone. MIND: A distributed multidimensional indexing for network diagnosis.
In Proceedings of the 25th IEEE International Conference on Computer Communications
(INFOCOM 2006), pages 1422-1433, 2006b.

Yihua  Liao  and  V.  Rao  Vemuri.  Using  text  categorization  techniques  for  intrusion  detection. 
In  Proceedings  of  the  USENIX  Security  Symposium,  pages  51-59,  2002. 

Hsuan-Tien Lin, Chih-Jen Lin, and Ruby C. Weng. A note on Platt’s probabilistic outputs
for  support  vector  machines.  Machine  Learning,  68:267-276,  2007. 

Laszlo Lovasz and Santosh Vempala. Simulated annealing in convex bodies and an O*(n^4)
volume algorithm. In Proceedings of the 44th Annual IEEE Symposium on Foundations
of Computer Science (FOCS ’03), pages 650-659, 2003.

Laszlo  Lovasz  and  Santosh  Vempala.  Hit-and-run  from  a  corner.  In  Proceedings  of  the 
Thirty-Sixth Annual ACM Symposium on Theory of Computing (STOC ’04), pages 310-314,
2004.

Daniel Lowd and Christopher Meek. Good word attacks on statistical spam filters. In
Proceedings  of  the  Conference  on  Email  and  Anti-Spam  (CEAS),  July  2005a. 

Daniel Lowd and Christopher Meek. Adversarial learning. In Proceedings of the Eleventh
ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (KDD
’05),  pages  641-647,  2005b. 

Kong-wei  Lye  and  Jeannette  M.  Wing.  Game  strategies  in  network  security.  In  Proceedings 
of the Foundations of Computer Security Workshop, pages 13-22, 2002.

Ricardo  Maronna.  Principal  components  and  orthogonal  regression  based  on  robust  scales. 
Technometrics,  47(3):264-273,  2005. 

Frank McSherry and Ilya Mironov. Differentially private recommender systems: building
privacy into the net. In Proceedings of the 15th ACM SIGKDD International Conference
on  Knowledge  Discovery  and  Data  Mining  (KDD  ’09),  pages  627-636,  2009. 

Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In Proceedings
of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS ’07),
pages  94-103,  2007. 

Tony  Meyer  and  Brendon  Whateley.  SpamBayes:  Effective  open-source,  Bayesian  based, 
email classification system. In Proceedings of the Conference on Email and Anti-Spam
(CEAS),  July  2004. 

Tom  M.  Mitchell.  Machine  Learning.  McGraw-Hill,  1997. 


R. Ann Miura-Ko and Nicholas Bambos. SecureRank: A risk-based vulnerability
management scheme for computing infrastructures. In Proceedings of IEEE International
Conference  on  Communications,  pages  1455-1460,  2007. 

R. Ann Miura-Ko, Benjamin Yolken, John Mitchell, and Nicholas Bambos. Security
decision-making among interdependent organizations. In Proceedings of the 21st IEEE Computer
Security Foundations Symposium, pages 66-80, 2008.

Mozilla Foundation. Known vulnerabilities in Mozilla products, 2010.
http://www.mozilla.org/security/known-vulnerabilities/ [Online; accessed 14-January-2010].

Sayan Mukherjee, Partha Niyogi, Tomaso Poggio, and Ryan Rifkin. Learning theory:
Stability is sufficient for generalization and necessary and sufficient for consistency of empirical
risk  minimization.  Advances  in  Computational  Mathematics,  25:161-193,  2006. 

Srinivas  Mukkamala,  Guadalupe  Janoski,  and  Andrew  Sung.  Intrusion  detection  using 
neural  networks  and  support  vector  machines.  In  Proceedings  of  the  International  Joint 
Conference  on  Neural  Networks  (IJCNN),  pages  1702-1707,  2002. 

Arvind  Narayanan  and  Vitaly  Shmatikov.  Robust  de-anonymization  of  large  sparse  datasets. 
In  Proceedings  of  the  2008  IEEE  Symposium  on  Security  and  Privacy,  pages  111-125, 
2008. 

Narus, 2010. http://www.narus.com [Online; accessed 22-April-2010].

Nature.  Security  ethics.  Nature,  463(7278):136,  14  Jan  2010.  Editorial. 

James Newsome, Brad Karp, and Dawn Song. Polygraph: Automatically generating
signatures for polymorphic worms. In Proceedings of the IEEE Symposium on Security and
Privacy,  pages  226-241,  2005. 

James  Newsome,  Brad  Karp,  and  Dawn  Song.  Paragraph:  Thwarting  signature  learning 
by  training  maliciously.  In  Proceedings  of  the  9th  International  Symposium  on  Recent 
Advances  in  Intrusion  Detection  (RAID  2006),  pages  81-105,  2006. 

Erik  Ordentlich  and  Thomas  M.  Cover.  The  cost  of  achieving  the  best  portfolio  in  hindsight. 
Mathematics  of  Operations  Research,  23(4):960-982,  1998. 

Xinming Ou, Wayne F. Boyer, and Miles A. McQueen. A scalable approach to attack graph
generation. In Proceedings of the 13th ACM Conference on Computer and
Communications Security, pages 336-345, 2006.

PhishTank. http://www.phishtank.com, 2010. [Online; accessed 13-April-2010].

John  P.  Pironti.  Key  elements  of  an  information  security  program.  Information  Systems 
Control  Journal,  1,  2005. 

John  Ross  Quinlan.  Induction  of  decision  trees.  Machine  Learning,  1:81-106,  1986. 


Luis  Rademacher  and  Navin  Goyal.  Learning  convex  bodies  is  hard.  In  Proceedings  of  the 
22nd  Annual  Conference  on  Learning  Theory  (COLT  2009),  pages  303-308,  2009. 

Ali  Rahimi  and  Benjamin  Recht.  Random  features  for  large-scale  kernel  machines.  In 
Advances  in  Neural  Information  Processing  Systems  20,  pages  1177-1184,  2008. 

Anirudh Ramachandran, Nick Feamster, and Santosh Vempala. Filtering spam with behavioral
blacklisting. In Proceedings of the 14th ACM Conference on Computer and Communications
Security (CCS ’07), pages 342-351, 2007.

Eric  Rescorla.  Is  finding  security  holes  a  good  idea?  IEEE  Security  and  Privacy,  3(1): 
14-19,  2005. 

Thomas  C.  Rindfleisch.  Privacy,  information  technology,  and  health  care.  Communications 
of  the  ACM,  40(8):92-100,  1997. 

Haakon Ringberg, Augustin Soule, Jennifer Rexford, and Christophe Diot. Sensitivity of
PCA for traffic anomaly detection. In Proceedings of the 2007 ACM SIGMETRICS International
Conference on Measurement and Modeling of Computer Systems (SIGMETRICS ’07),
pages 109-120, 2007.

Gary  Robinson.  A  statistical  approach  to  the  spam  problem.  Linux  Journal,  March  2003. 

Benjamin I. P. Rubinstein, Peter L. Bartlett, Ling Huang, and Nina Taft. Learning in a large
function  space:  Privacy-preserving  mechanisms  for  SVM  learning.  CoRR,  abs/0911.5708, 
2009.  Submitted  30  Nov  2009. 

Walter Rudin. Fourier Analysis on Groups. Wiley Classics Library. Wiley-Interscience,
reprint  edition,  1994. 

Udam Saini. Machine learning in the presence of an adversary: Attacking and defending
the spambayes spam filter. Dissertation UCB/EECS-2008-62, Department of Electrical
Engineering and Computer Sciences, University of California at Berkeley, 2008.

Sriram Sankararaman, Guillaume Obozinski, Michael I. Jordan, and Eran Halperin. Genomic
privacy and limits of individual detection in a pool. Nature Genetics, 41(9):965-967,
2009.

Anand D. Sarwate, Kamalika Chaudhuri, and Claire Monteleoni. Differentially private
support  vector  machines.  CoRR,  abs/0912.0071,  2009.  Submitted  1  Dec  2009. 

Greg Schohn and David Cohn. Less is more: Active learning with support vector machines.
In  Proceedings  of  the  Seventeenth  International  Conference  on  Machine  Learning  (ICML 
2000),  pages  839-846,  2000. 

Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines,
Regularization, Optimization, and Beyond. Adaptive Computation and Machine
Learning. MIT Press, 2001.




Cyrus  Shaoul  and  Chris  Westbury.  A  USENET  corpus  (2005-2007),  October  2007. 

Robert  L.  Smith.  The  hit-and-run  sampler:  A  globally  reaching  Markov  chain  sampler  for 
generating  arbitrary  multivariate  distributions.  In  Proceedings  of  the  28th  Conference  on 
Winter  Simulation  (WSC  ’96),  pages  260-264,  1996. 

Augustin Soule, Kave Salamatian, and Nina Taft. Combining filtering and statistical methods
for anomaly detection. In Proceedings of the 5th ACM SIGCOMM Conference on
Internet Measurement (IMC ’05), pages 31-31, 2005.

Salvatore J. Stolfo, Shlomo Hershkop, Chia-Wei Hu, Wei-Jen Li, Olivier Nimeskern, and
Ke Wang. Behavior-based modeling and its application to Email analysis. ACM Transactions
on Internet Technology, pages 187-221, 2006.

Gilles  Stoltz  and  Gabor  Lugosi.  Internal  regret  in  on-line  portfolio  selection.  Machine 
Learning, 59(1-2):125-159, 2005.

Latanya Sweeney. k-anonymity: a model for protecting privacy. International Journal on
Uncertainty,  Fuzziness  and  Knowledge-based  Systems,  10(5):557-570,  2002. 

Kymie M. G. Tan, Kevin S. Killourhy, and Roy A. Maxion. Undermining an anomaly-based
intrusion  detection  system  using  common  exploits.  In  Proceedings  of  the  5th  International 
Conference  on  Recent  Advances  in  Intrusion  Detection  (RAID’02),  pages  54-73,  2002. 

Leslie G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134-1142,
November 1984.

Hal  R.  Varian.  Managing  online  security  risks.  New  York  Times.  Jun  1,  2000. 

Hal R. Varian. System reliability and free riding, 2001. Note available at
http://www.sims.berkeley.edu/~hal/Papers/2004/reliability.

Daniel Veditz. Personal communication, 2009.

Daniel Veditz. Mozilla security group, 2010.
http://www.mozilla.org/projects/security/secgrouplist.html.

Shobha Venkataraman, Avrim Blum, and Dawn Song. Limits of learning-based signature
generation with adversaries. In Proceedings of the Network and Distributed System Security
Symposium (NDSS’2008), 2008.

David  Wagner  and  Paolo  Soto.  Mimicry  attacks  on  host-based  intrusion  detection  systems. 
In  Proceedings  of  the  9th  ACM  Conference  on  Computer  and  Communications  Security, 
pages  255-264,  2002. 

Bernhard Warner. Home PCs rented out in sabotage-for-hire racket. Reuters, July 2004.




Wikipedia. Information security — Wikipedia, The Free Encyclopedia, 2010.
URL http://en.wikipedia.org/w/index.php?title=Information_security&oldid=355701059.
[Online; accessed 13-April-2010].

Leon Willenborg and Ton de Waal. Elements of Statistical Disclosure Control.
Springer-Verlag, 2001.

Gregory L. Wittel and S. Felix Wu. On attacking statistical spam filters. In Proceedings of
the Conference on Email and Anti-Spam (CEAS’04), 2004.

Aaron D. Wyner. Capabilities of bounded discrepancy decoding. The Bell System Technical
Journal, 44:1061-1122, Jul/Aug 1965.

Yin Zhang, Zihui Ge, Albert Greenberg, and Matthew Roughan. Network anomography.
In Proceedings of the 5th ACM SIGCOMM Conference on Internet Measurement (IMC
’05), pages 30-30, 2005.


